University of North Carolina School of Government

A Book Importer for the 21st Century

The Challenge

The University of North Carolina School of Government (UNC SoG) needed to create a more efficient way of getting its publications online to reach a larger audience.

Our Approach

Our team used Mammoth.js to convert UNC SoG’s Microsoft Word documents into HTML to create a microsite experience for each publication.

The Results

A custom-developed book importer tool that has become so integral to the school's plans for growth that, in our client's words, it has "become a staff member for the UNC SoG team."

Recognizing a Problem

Established in 2001, the University of North Carolina’s School of Government has become the largest university-based government training, advisory, and research organization in the United States. Each year the school’s faculty members publish approximately 50 books, manuals, reports, articles, bulletins, and other print and online content related to state and local government. These publications help North Carolina’s public officials and citizens understand state and local government. However, the school struggled to get its content into the hands of its customers. Its digital library was hidden behind a cumbersome website, and the publications themselves were Microsoft Word documents, a format poorly suited to readers who need easy access to information on the go.

UNC SoG partnered with Savas Labs to collaborate on the creation of a tool that would help truly digitize their content. 

Collaborating on a Technical Solution

Before we started working on a technical solution, we conducted a discovery phase that included a proof of concept. This process allowed us to build trust with UNC SoG and helped mitigate potential risks. We also learned that UNC SoG’s top priority was not to disrupt its current business processes but rather to build something that would seamlessly marry its publications and IT departments. If we succeeded, UNC SoG would finally be able to launch digital books alongside the print copies faster and more efficiently.

After much collaboration, the unanimous decision was to create a book importer tool. UNC SoG’s developers would use the importer to convert the Word documents into HTML, creating a microsite experience for each publication. HTML was the chosen format for digital publications because of its responsive nature. It would also allow customers to access the books from their mobile phones, tablets, and other devices without the hassle of needing a PDF reader or zooming in and out to read the content.

Transforming Word documents into HTML

Once the project details were sorted out, our development team got to work. The importer tool was built using React and Next.js. We used Next.js’s React front end (both static and server-side rendered pages) and its built-in, Express.js-like API routing system. The importer would allow the UNC SoG developers to upload any number of Word documents through a drag-and-drop interface.

From there, we needed to ensure that the specific Word styles UNC SoG used in its published books would carry over to their digital counterparts. With this in mind, our developers created a custom API that translates the styled Word documents into semantic, well-structured HTML and CSS that is usable on the web.
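To make the style translation concrete, here is a minimal sketch of how a table of Word styles could be turned into Mammoth.js style-map entries. The style names and HTML targets are hypothetical and are not UNC SoG's actual styles; only the `p[style-name='…'] => …` mapping syntax comes from Mammoth.js itself.

```typescript
// A hypothetical rule table: these Word style names and HTML targets are
// illustrative assumptions, not the styles UNC SoG actually uses.
interface StyleRule {
  kind: "p" | "r";   // "p" = paragraph style, "r" = run (character) style
  wordStyle: string; // style name as it appears in Word
  html: string;      // target element path in Mammoth.js style-map syntax
}

const rules: StyleRule[] = [
  { kind: "p", wordStyle: "Chapter Title", html: "h1.chapter-title:fresh" },
  { kind: "p", wordStyle: "Block Quote", html: "blockquote > p:fresh" },
  { kind: "r", wordStyle: "Case Name", html: "em.case-name" },
];

// Turn each rule into one style-map line.
function toStyleMap(styleRules: StyleRule[]): string[] {
  return styleRules.map(
    (rule) => `${rule.kind}[style-name='${rule.wordStyle}'] => ${rule.html}`
  );
}

const styleMap = toStyleMap(rules);
```

An array like `styleMap` is the kind of value that could be passed to Mammoth.js as its `styleMap` conversion option, so adding support for a new Word style would only mean adding a row to the table.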

First, to ensure the digital books would carry the correct Word styles, our developers designed a feature in the importer tool that requires users to run each Word document through the API twice. In addition, the API displays warnings about any Word styles that do not yet have corresponding HTML/CSS set up. This step was a considerable value-add for the UNC SoG team and a big win for us, as it gave both teams visibility into which styles future updates to the tool would need to support.
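As a rough illustration of the warning step, the sketch below collects the Word styles that have no mapping yet from a list of conversion messages. It assumes messages shaped like the ones Mammoth.js returns (a `type` and a `message`) and assumes the "Unrecognised paragraph style: '…'" wording; both the function name and the sample messages are hypothetical, and the real importer may surface warnings differently.

```typescript
// Assumed message shape, modelled on what Mammoth.js returns with a conversion.
interface ConversionMessage {
  type: "warning" | "error";
  message: string;
}

// Collect the distinct Word styles that were reported as unmapped.
// The message wording matched here is an assumption and may vary by version.
function unmappedStyles(messages: ConversionMessage[]): string[] {
  const pattern = /^Unrecognised (paragraph|run) style: '(.+?)'/;
  const found: string[] = [];
  for (const entry of messages) {
    if (entry.type !== "warning") continue;
    const match = pattern.exec(entry.message);
    if (!match) continue;
    const label = `${match[1]}: ${match[2]}`;
    if (found.indexOf(label) === -1) found.push(label); // skip duplicates
  }
  return found.sort();
}

// Hypothetical messages from a first pass over a manuscript.
const sampleMessages: ConversionMessage[] = [
  { type: "warning", message: "Unrecognised paragraph style: 'Sidebar Heading' (Style ID: SidebarHeading)" },
  { type: "warning", message: "Unrecognised run style: 'Statute Cite' (Style ID: StatuteCite)" },
  { type: "warning", message: "Unrecognised paragraph style: 'Sidebar Heading' (Style ID: SidebarHeading)" },
  { type: "error", message: "Something unrelated went wrong" },
];

const missing = unmappedStyles(sampleMessages);
```

A list like `missing` gives both teams a short to-do list of styles that still need HTML/CSS before the book is published.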

Second, to keep costs down on the tool’s hosting platform (Heroku), one of the static pages uses a file upload library called FilePond to upload Word documents directly to an Amazon S3 bucket (a cloud storage space for the books), bypassing the application server. The benefits are both performance and cost: Heroku charges for the processing time you use, so uploading straight to S3 saves the school time and money.
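One common way to upload straight from the browser to S3 is a presigned URL, where the server only issues a short-lived URL and the file itself never passes through it. The sketch below assumes that approach, which this write-up does not confirm; the function names and the example URL are hypothetical. It only builds the description of the PUT request the browser would send.

```typescript
// MIME type for .docx files.
const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

// Description of the request the browser would send directly to S3.
interface DirectUploadRequest {
  method: "PUT";
  url: string;
  headers: { "Content-Type": string };
}

function isDocx(fileName: string): boolean {
  return /\.docx$/i.test(fileName);
}

// Assumption: `presignedUrl` was issued by a small API route, so the
// application server never handles the file's bytes.
function buildDirectUpload(fileName: string, presignedUrl: string): DirectUploadRequest {
  if (!isDocx(fileName)) {
    throw new Error(`Not a .docx file: ${fileName}`);
  }
  return {
    method: "PUT",
    url: presignedUrl,
    headers: { "Content-Type": DOCX_MIME },
  };
}

// Hypothetical file name and URL, for illustration only.
const uploadRequest = buildDirectUpload(
  "chapter-01.docx",
  "https://example-bucket.s3.amazonaws.com/chapter-01.docx?signature=placeholder"
);
```

Rejecting anything other than `.docx` before the upload starts keeps unsupported files out of the bucket, since Mammoth.js converts only that format.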

After the Word documents were uploaded to S3, we used the Mammoth.js library to convert them, along with the custom Word styles applied during the print-preparation process, into semantic, styled HTML and CSS.

Extended for the Future

Throughout the project, one goal united the UNC SoG team and ours: their team would be able to take over the tool so that future books could be converted and released with ease.

To ensure a smooth project handoff, our team created in-depth documentation and training videos to set up the UNC SoG developers for success.

Curious about the process?

Want to learn more about the process that UNC SoG and our team followed to get this project from kernel to completion? Watch our on-demand webinar.