ACAP Validator
Automated Content Access Protocol (ACAP) is a new way for publishers to communicate access and usage permissions to search engines and other automated consumers of digital content. ACAP Validator is a new software tool that helps you create valid ACAP content access files. It provides a clear diagnostic message for every invalid line in an ACAP file, helping you pinpoint problems and understand how to fix them.
How ACAP works
ACAP is designed so that its content access rules fit into your robots.txt file, making it easy for search engines and other crawlers to find and parse ACAP rules. To avoid conflict with robots.txt rules, all ACAP data lines start with the ACAP- prefix. For instance, here’s a sample set of ACAP rules:
ACAP-qualified-usage: long-term-preserve preserve time-period=permanently
# Permissions for crawlers in general
ACAP-crawler: *
ACAP-usage-purpose: *
ACAP-allow-crawl: /
ACAP-allow-index: /public
ACAP-disallow-index: /
ACAP-allow-(long-term-preserve): /public
ACAP-disallow-(long-term-preserve): /
Let’s walk through this line by line.
ACAP-qualified-usage: long-term-preserve preserve time-period=permanently
This line defines a shorthand name, long-term-preserve, for a longer content access expression: the preserve usage qualified by time-period=permanently. ACAP content access expressions are both flexible and precise. This one can be used to designate content that is available for permanent storage in digital archives.
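To make the shape of a definition line concrete, here is a minimal Python sketch that splits one into its parts. It assumes the value is a whitespace-separated list consisting of the shorthand name, the base usage type, and any number of qualifiers. The function name parse_qualified_usage is made up for illustration, and the real ACAP grammar may allow more than this.

```python
def parse_qualified_usage(line):
    """Split an ACAP-qualified-usage line into (name, base usage, qualifiers).

    Assumption: tokens after the colon are separated by whitespace.
    """
    field, _, value = line.partition(":")
    if field.strip().lower() != "acap-qualified-usage":
        raise ValueError(f"not a qualified-usage line: {line!r}")
    parts = value.split()
    if len(parts) < 2:
        raise ValueError("expected a shorthand name followed by a base usage type")
    name, base_usage, *qualifiers = parts
    return name, base_usage, qualifiers


print(parse_qualified_usage(
    "ACAP-qualified-usage: long-term-preserve preserve time-period=permanently"
))
# prints ('long-term-preserve', 'preserve', ['time-period=permanently'])
```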
# Permissions for crawlers in general
ACAP-crawler: *
Like the User-agent field in a robots.txt file, ACAP-crawler names the crawler that the subsequent lines apply to. The * wildcard used here means they apply to any automated crawler.
ACAP-allow-crawl: /
The ACAP-allow- tag takes a content usage type as a parameter, in this case crawl. Taken as a whole, this field says that the entire site can be crawled.
ACAP-allow-index: /public
In this case, the usage type is index, and the rule applies only to objects accessible under /public. Say that the robots.txt file is served from http://copyrightlabs.com/robots.txt. This rule would allow search engines to index http://copyrightlabs.com/public/permitted.html.
ACAP-disallow-index: /
No other files may be indexed. For example, http://copyrightlabs.com/forbidden.html cannot be indexed. The more specific rule above still applies, though, because in ACAP files a more specific rule overrides less specific ones.
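One way to picture the "more specific rule wins" behaviour is as a longest-prefix match. The sketch below is only an illustration under that assumption: it treats each rule as a plain path prefix and lets the longest matching prefix decide. The function name is_allowed and the rule representation are invented here, and plain prefix matching is cruder than what a real crawler would do (for instance, /public would also match /publications).

```python
def is_allowed(path, rules):
    """Return True/False from the most specific matching rule, or None if none match.

    rules is a list of (allowed, prefix) pairs.
    Assumption: "more specific" means "longer matching path prefix".
    """
    best = None
    for allowed, prefix in rules:
        if path.startswith(prefix):
            if best is None or len(prefix) > len(best[1]):
                best = (allowed, prefix)
    return best[0] if best is not None else None


# The two index rules from the sample file.
index_rules = [(True, "/public"), (False, "/")]

print(is_allowed("/public/permitted.html", index_rules))  # prints True
print(is_allowed("/forbidden.html", index_rules))         # prints False
```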
ACAP-allow-(long-term-preserve): /public
In this case, the parameter to ACAP-allow- is the qualified usage we defined above, in the first line of the file. The parentheses in (long-term-preserve) indicate that the parameter is a locally-defined shorthand expression, instead of a standard type.
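The distinction between a standard usage type and a parenthesized local shorthand can be sketched with a small regular expression. This is a hypothetical pattern written for this walk-through, not the validator's own grammar: it assumes usage names consist of letters, digits, underscores and hyphens.

```python
import re

# Hypothetical pattern: ACAP-allow-<usage> or ACAP-disallow-<usage>, where
# <usage> is either a bare name or a name wrapped in parentheses.
FIELD_RE = re.compile(
    r"^ACAP-(allow|disallow)-(?:\(([\w-]+)\)|([\w-]+))$", re.IGNORECASE
)


def parse_permission_field(field):
    """Return (action, usage, is_local) for a permission field name."""
    match = FIELD_RE.match(field.strip())
    if match is None:
        raise ValueError(f"not a permission field: {field!r}")
    action = match.group(1).lower()
    if match.group(2) is not None:
        # Parenthesized: a locally defined shorthand.
        return action, match.group(2), True
    # Bare name: a standard usage type.
    return action, match.group(3), False


print(parse_permission_field("ACAP-allow-(long-term-preserve)"))
# prints ('allow', 'long-term-preserve', True)
print(parse_permission_field("ACAP-disallow-index"))
# prints ('disallow', 'index', False)
```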
ACAP-disallow-(long-term-preserve): /
Again, the more general disallow rule forbids this usage for anything not covered by the more specific allow rule above.
Let’s run this file through AcapValidator.
--------------------
File parsed.
--------------------
Local definitions:
["long-term-preserve", ["preserve", ["time-period=permanently"]]]
Resource sets:
Record: *
Subrecord: *
Allow: allow_crawl/
Allow: allow_index/public
Allow: allow-(long_term_preserve)/public
Disallow: disallow-index/
Disallow: disallow-(long_term_preserve)/
--------------------
No errors.
--------------------
So that was a valid ACAP rule set. Chances are, the first ACAP file you write will contain errors. Let’s add some rules with problems and see what we get.
ACAP-allow-(short-term-preserve): /public
ACAP-disallow-(short-term-preserve): /
We’re adding permissions for a new qualified usage that we haven’t defined. Let’s see what AcapValidator has to say.
--------------------
File parsed.
--------------------
Local definitions:
["long-term-preserve", ["preserve", ["time-period=permanently"]]]
Resource sets:
Record: *
Subrecord: *
Allow: allow_crawl/
Allow: allow_index/public
Allow: allow_(long_term_preserve)/public
Allow: allow_(short_term_preserve)/public
Disallow: disallow_index/
Disallow: disallow_(long_term_preserve)/
Disallow: disallow_(short_term_preserve)/
--------------------
Line 10: Local definition does not exist: short-term-preserve
Line 11: Local definition does not exist: short-term-preserve
--------------------
AcapValidator keeps track of the local definitions and makes sure we don’t use anything we haven’t defined. We can fix this problem with one more line added to the top of the file.
ACAP-qualified-usage: short-term-preserve preserve time-period=30-days
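The kind of bookkeeping described above, tracking local definitions and flagging any use of an undefined one, can be approximated in a few lines of Python. This is a rough sketch and not how AcapValidator is implemented: the regular expressions and the find_undefined_usages name are assumptions, and it collects definitions from anywhere in the file rather than requiring them to come first. The printed message simply echoes the wording of the validator output shown earlier, for comparison.

```python
import re

# Hypothetical patterns for definition lines and for uses of a local shorthand.
DEFINITION_RE = re.compile(r"^ACAP-qualified-usage:\s*(\S+)", re.IGNORECASE)
LOCAL_USE_RE = re.compile(r"^ACAP-(?:allow|disallow)-\(([^)]+)\):", re.IGNORECASE)


def find_undefined_usages(text):
    """Return (line number, name) for each use of an undefined local shorthand."""
    lines = text.splitlines()
    # First pass: collect every locally defined shorthand name.
    defined = set()
    for line in lines:
        match = DEFINITION_RE.match(line.strip())
        if match:
            defined.add(match.group(1))
    # Second pass: report parenthesized usages that were never defined.
    errors = []
    for number, line in enumerate(lines, start=1):
        match = LOCAL_USE_RE.match(line.strip())
        if match and match.group(1) not in defined:
            errors.append((number, match.group(1)))
    return errors


SAMPLE = """\
ACAP-qualified-usage: long-term-preserve preserve time-period=permanently
# Permissions for crawlers in general
ACAP-crawler: *
ACAP-usage-purpose: *
ACAP-allow-crawl: /
ACAP-allow-index: /public
ACAP-disallow-index: /
ACAP-allow-(long-term-preserve): /public
ACAP-disallow-(long-term-preserve): /
ACAP-allow-(short-term-preserve): /public
ACAP-disallow-(short-term-preserve): /
"""

for number, name in find_undefined_usages(SAMPLE):
    print(f"Line {number}: Local definition does not exist: {name}")
```

Prepending the short-term-preserve definition to SAMPLE makes the function return an empty list, which corresponds to the fix described above.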
These examples offer a glimpse into the richness and flexibility of the standard. Take a look at http://www.the-acap.org later this month after the standard is released.






