A Customized Viewer for DocumentCloud

This post is for newsrooms using DocumentCloud, the fantastic document viewer developed by journalist-programmers at ProPublica and The New York Times.

Want a custom viewer for your site's documents? You can have ours.

I built it so that once set up, this viewer will automagically fill in the title, source and "back-to-article" link based on information already associated with the document -- so one file serves all of your documents.

Here's how.

One-time Setup

You can make this work with a little knowledge of html and access to a web server. You'll need to host a single html page, called dc.html, and a tiny JavaScript file, called jquery.url.min.js.

1. Download the html code for dc.html by right-clicking on this link (or view it here).

2. Use any text editor to edit the path to your logo image on line 101. (A logo that's 60 pixels high works well.)

3. On line 101, change "www.wnyc.org" to your site's home page.

4. Upload the file dc.html to a web server.

To extract the document info from the URL, the page uses a little JavaScript program called jquery.url.min.js which you can read about here and download here. Once you do:

5. Upload jquery.url.min.js to your web server. (The page assumes it's in the js/ subdirectory.)

6. If you need to change the location of jquery.url.min.js, edit the path on line 38 of the html code and re-upload dc.html.
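If you're curious what "extracting the document info from the URL" amounts to, here is a minimal sketch. It is not the code inside dc.html: the page uses the jquery.url.min.js plugin, while this sketch swaps in the standard URL API that browsers and Node provide. The function name getDocumentId is made up for illustration, and the "doc" parameter is the one the viewer links use.

```javascript
// Sketch only: reads the "doc" query parameter with the standard URL API.
// dc.html itself relies on the jquery.url.min.js plugin for this step.
// getDocumentId is a hypothetical name, not something defined in dc.html.
function getDocumentId(pageUrl) {
  const params = new URL(pageUrl).searchParams;
  return params.get("doc"); // null when the parameter is missing
}

console.log(
  getDocumentId("http://project.wnyc.org/documents/dc.html?doc=11275-bill-a11354")
);
```

Anything after a "#" in the link is not part of the query string, so it doesn't interfere with reading the document ID.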

Using the Viewer

To use the viewer, simply construct a link to it that combines dc.html's location and the ID of the document you want it to load. For example, the base URL for WNYC's version of dc.html is here:

http://project.wnyc.org/documents/dc.html

And the document I want to display is here:

https://www.documentcloud.org/documents/11275-bill-a11354.html

I combine them into a new link by taking the base URL, adding "?doc=" and then adding the document ID -- which, here, is 11275-bill-a11354 (omitting the .html). Like this:

http://project.wnyc.org/documents/dc.html?doc=11275-bill-a11354

Voila.
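If you build a lot of these links, the steps above can be sketched as a small helper. This is just an illustration, not part of dc.html: the function name buildViewerLink is hypothetical, and it assumes the DocumentCloud URL ends in the document ID followed by ".html", as in the example.

```javascript
// Hypothetical helper: builds a viewer link from a DocumentCloud document URL.
// Assumes the last path segment is "<document-id>.html", as in the example above.
function buildViewerLink(viewerBase, documentUrl) {
  const path = new URL(documentUrl).pathname;    // "/documents/11275-bill-a11354.html"
  const fileName = path.split("/").pop();        // "11275-bill-a11354.html"
  const docId = fileName.replace(/\.html$/, ""); // drop the trailing ".html"
  return viewerBase + "?doc=" + encodeURIComponent(docId);
}

console.log(
  buildViewerLink(
    "http://project.wnyc.org/documents/dc.html",
    "https://www.documentcloud.org/documents/11275-bill-a11354.html"
  )
);
```

Run against the two URLs above, this should produce the same combined link shown in the post.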

Pages and Annotations

For extra trickiness, you can jump to specific page numbers and annotations by adding references to them into your link. Here you need to append "#document/p" and the page number. So for page 2, you'd use:

http://project.wnyc.org/documents/dc.html?doc=11275-bill-a11354#document/p2

And for the annotation on page 3, it would be:

http://project.wnyc.org/documents/dc.html?doc=11275-bill-a11354#document/p3/a3975

(You get the annotation number -- and the whole phrase after the #, actually -- by clicking on the little "link" icon next to the annotation's title.)
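The same pattern can be sketched in code. Again, this is only an illustration with a made-up function name (addPageAnchor); it assumes the "#document/p<page>" and "/a<annotation>" forms shown in the links above, and that you already have the annotation number from the "link" icon.

```javascript
// Hypothetical helper: appends a page (and optionally an annotation) reference
// to a viewer link, following the "#document/p<page>/a<annotation>" pattern.
function addPageAnchor(viewerLink, page, annotationId) {
  let fragment = "#document/p" + page;
  if (annotationId !== undefined) {
    fragment += "/a" + annotationId;
  }
  return viewerLink + fragment;
}

const link = "http://project.wnyc.org/documents/dc.html?doc=11275-bill-a11354";
console.log(addPageAnchor(link, 2));       // page 2
console.log(addPageAnchor(link, 3, 3975)); // annotation 3975 on page 3
```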

That's it.

Credits and Disclaimers

The base design is built on code the Chicago Tribune News Apps Team wrote, which I modified with help from the DocumentCloud folks to dynamically take up the title, source and related-story information from the document's metadata.

Note that the version of dc.html at project.wnyc.org contains extra tracking code specific to our servers. The version here does not. It's the one you should download.

And I don't warrant in any way that this is perfect code, so please use at your own risk.

If you modify it -- especially if you improve on what's here -- please let me know and I'll share the updates here and on GitHub.

8 responses

Is there any way for the application to make the book go to full screen mode?

— semiicold

I'm not aware of *full* full-screen mode in DocumentCloud. I'm pretty sure there's a way to hide the right column, which would get you close. I'd have to dig a little for how, tho.

— John Keefe

:) Thank you for the fast reply.

Build custom document cloud viewer using another great walkthrough from @jkeefe, and based on @tribapps code #rocknroll

— Chris Keller

thanks for sharing the template. i just put it to use: http://www.tampabay.com/news/business/banking/article1192715.ece

— wmhiggins

If I add the Google Analytics code to dc.html, will GA track dc.html as one page, or will it make a different track for each document URL, like this: MySite\dc.html?doc=MyDCdocument

— Carlos Osorio

John, we are using the viewer you designed and we find it is great. A question, though. We have a data field (not source, nor title, nor description) that we call “Classification.” We want to display it in the viewer instead of the title. We tried modifying a line in the code within the html so that, instead of getting the title, it would read: var document_title = viewer.api.getClassification(); It does not work. Do you have suggestions as to how to do this? We appreciate your time on this. Thanks in advance for your help.

Carlos Osorio, National Security Archive

Does this also display the logo when using the embedded document viewer in an article, or does it only display in the full-page Document Cloud template?

— Jonathan H.

johnkeefe.net

journalism, data & diy hackery
