How to Convert CollectionBuilder-CSV to Oral History as Data
This guide shows you how to convert an existing CollectionBuilder-CSV repository into an Oral History as Data (OHD) repository.
When You Need This
- You have an existing CollectionBuilder-CSV project you want to convert
- You want to add transcript visualization and analysis features
- You need to maintain your existing metadata while adding OHD functionality
Step 1: Download the OHD Template
- Access the OHD template repository
  - Go to https://github.com/oralhistoryasdata/template
  - Click the green “Code” button
  - Select “Download ZIP”
- Extract the downloaded ZIP file
  - Save it to a location where you can easily access the files
  - Extract/unzip the file to view its contents
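
If you prefer to script the extraction instead of using your file manager, the following is a minimal sketch using only the Python standard library. The function name `extract_template` and both paths are placeholders, not part of the OHD template. It assumes you have already downloaded the ZIP through the browser as described above.

```python
import zipfile
from pathlib import Path


def extract_template(zip_path, dest_dir):
    """Extract the downloaded template ZIP and return its top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # GitHub ZIP downloads normally unpack into a single wrapper folder,
    # so the returned list usually holds one folder name.
    return sorted(entry.name for entry in dest.iterdir())
```

The returned folder name is the one to open in Step 2. Its exact name depends on the repository's default branch, so check the result rather than assuming it.
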

Step 2: Copy Files to Your CollectionBuilder Repository
- Open both folders
  - Open your existing CollectionBuilder-CSV repository folder
  - Open the extracted OHD template folder
- Replace the following files and folders

Action | File/Folder | Notes |
---|---|---|
Replace | _data/ folder | Contains OHD-specific configurations |
Add | _includes/transcript/ folder | Contains transcript visualization templates |
Add | _sass/ohd.scss | Styles for transcript visualizations |
Add | _layouts/visualization.html | Layout for visualization pages |
Add | _layouts/item/transcript.html | Layout for transcript items |
Add | _layouts/home-cover.html | Cover page layout |
Replace | _layouts/default.html | Adjusted layout base for OHD options |
Add | pages/visualization.html | Visualization page template |
Replace | pages/about.md | Remember to edit with your project info |
Replace | pages/index.md | Change layout to “home-cover” |
Replace | assets/css/cb.scss | Updated styles for OHD |
Add | assets/data/filter-facets.csv | Data for presenting count of filters |
Add | assets/data/filters.csv | Data download for filters.csv used in project |
Add | assets/data/transcript-collection.json | All transcripts in one big JSON file |
Replace | _config.yml | OHD configuration file |
Replace | readme.md | Edit to reflect your project |
Replace | .gitignore | Removes line that excludes “objects/” from git |
Add | rakelib/generate_json.rake | Helper for generating transcript JSON |
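
Copying this many files by hand is easy to get wrong, so here is a hedged sketch of how the files listed above could be copied with a short Python script. The function name `copy_ohd_files` and the directory arguments are hypothetical. One deliberate assumption: folders are merged into your repository rather than deleted and recreated, so files that exist only in your repository (such as your own metadata CSV in _data/) are left in place, while files with the same name are overwritten. Back up or commit your repository before running anything like this.

```python
import shutil
from pathlib import Path

# Paths copied from the list above; a trailing slash marks a folder.
OHD_PATHS = (
    "_data/",
    "_includes/transcript/",
    "_sass/ohd.scss",
    "_layouts/visualization.html",
    "_layouts/item/transcript.html",
    "_layouts/home-cover.html",
    "_layouts/default.html",
    "pages/visualization.html",
    "pages/about.md",
    "pages/index.md",
    "assets/css/cb.scss",
    "assets/data/filter-facets.csv",
    "assets/data/filters.csv",
    "assets/data/transcript-collection.json",
    "_config.yml",
    "readme.md",
    ".gitignore",
    "rakelib/generate_json.rake",
)


def copy_ohd_files(template_dir, repo_dir, paths=OHD_PATHS):
    """Copy each listed file or folder from the template into the repository."""
    template, repo = Path(template_dir), Path(repo_dir)
    copied = []
    for rel in paths:
        src, dst = template / rel, repo / rel
        if not src.exists():
            print(f"skipped (not found in template): {rel}")
            continue
        if src.is_dir():
            # Assumption: merge into an existing folder instead of wiping it,
            # so repository-only files survive (requires Python 3.8+).
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        copied.append(rel)
    return copied
```

The function returns the list of paths it actually copied, which you can compare against the list above to spot anything the template did not contain.
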

Step 3: Update .gitignore File
- Edit the .gitignore file in your repository
  - Open .gitignore in a text editor
  - Find the line that excludes “objects/” at the end of the file
  - Delete this line completely
  - Save the file
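
The same edit can be scripted. This is a minimal sketch, not part of the OHD template: the function name `remove_objects_rule` is hypothetical, and it assumes the rule is written as "objects/", "objects", or "/objects/" on a line of its own.

```python
from pathlib import Path


def remove_objects_rule(gitignore_path):
    """Drop any line that ignores objects/ and report whether one was removed."""
    path = Path(gitignore_path)
    lines = path.read_text(encoding="utf-8").splitlines()
    # Assumption: the rule appears as "objects/", "objects", or "/objects/".
    kept = [line for line in lines if line.strip().strip("/") != "objects"]
    if len(kept) == len(lines):
        return False  # nothing to remove, file left untouched
    path.write_text("\n".join(kept) + "\n", encoding="utf-8")
    return True
```

A return value of False means no matching line was found, which is expected if you already replaced .gitignore with the template's copy in Step 2.
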

Step 4: Update Your Metadata
- Check your metadata format
  - OHD requires specific metadata fields for transcripts to work properly
  - Open the _data/demo-ohd-metadata.csv file from the OHD template to see the fields:
    - Required:
      - objectid: Unique identifier for each item. It MUST match the filename of the transcript CSV!
      - title: This could be just the name of the interviewee, or something like “Oral History Interview with Jane Doe”
      - display_template: Should be set to transcript for oral history/interview items
    - Optional:
      - interviewee: Name of the person interviewed
      - interviewer: Name of the person conducting the interview
      - pdf: Path to transcript PDF file
        - PDFs will be automatically generated using PagedJS, but if you’d rather link to a prepared PDF file, list it here
        - Can be an external link, or a relative link to a PDF stored in the repository
      - bio: Biographical information about the interviewee
        - You can use Markdown to style the bios
- Adapt your existing metadata
  - Update your existing metadata CSV to include the required OHD fields
  - Ensure your transcript files match the names in your metadata
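
Before moving on, it can help to check the metadata CSV programmatically. The sketch below is an illustration only: the function name `check_metadata` is hypothetical, and the checks (required columns present, objectid filled in and unique, transcript items have a title) are one reasonable reading of the requirements above rather than an official validator.

```python
import csv

REQUIRED_FIELDS = ("objectid", "title", "display_template")


def check_metadata(csv_path):
    """Return a list of human-readable problems found in the metadata CSV."""
    problems = []
    # utf-8-sig tolerates the byte-order mark some spreadsheet exports add
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        for field in REQUIRED_FIELDS:
            if field not in columns:
                problems.append(f"missing column: {field}")
        if problems:
            return problems
        seen = set()
        for row_number, row in enumerate(reader, start=2):  # row 1 is the header
            objectid = (row["objectid"] or "").strip()
            if not objectid:
                problems.append(f"row {row_number}: empty objectid")
            elif objectid in seen:
                problems.append(f"row {row_number}: duplicate objectid {objectid}")
            seen.add(objectid)
            is_transcript = (row["display_template"] or "").strip() == "transcript"
            if is_transcript and not (row["title"] or "").strip():
                problems.append(f"row {row_number}: transcript item has no title")
    return problems
```

An empty list means none of these particular checks failed. It does not guarantee the site will build.
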
Step 5: Add Your Transcript Files
- Prepare your transcript files
  - Transcripts should be in plain text format with specific formatting
  - Each speaker should be indicated by their name followed by a colon
  - See the OHD template examples for formatting guidance
- Place transcript files in the correct location
  - Add your transcript files to the objects/ directory
  - Ensure filenames match those referenced in your metadata
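
Because a filename mismatch is easy to miss, here is a small sketch that lists transcript items with no matching file. The function name `find_missing_transcripts` and both paths are hypothetical. It compares each objectid against filenames with the extension removed, so it makes no assumption about which file extension your transcripts use.

```python
import csv
from pathlib import Path


def find_missing_transcripts(metadata_csv, transcript_dir):
    """Return objectids of transcript items that have no matching file."""
    # Filenames without extension, e.g. "doe" for a file named "doe.csv"
    stems = {p.stem for p in Path(transcript_dir).iterdir() if p.is_file()}
    missing = []
    with open(metadata_csv, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            # Only items displayed with the transcript template need a file
            if (row.get("display_template") or "").strip() != "transcript":
                continue
            objectid = (row.get("objectid") or "").strip()
            if objectid not in stems:
                missing.append(objectid)
    return missing
```

Any objectid returned here needs either a transcript file added or its metadata row corrected.
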

Step 6: Test Your Conversion
- Run the site locally (if possible)
  - Use Jekyll to build and preview your site locally
  - Check that visualizations and transcripts display correctly
- Commit and push your changes
  - Commit all new and modified files to your repository
  - Push the changes to GitHub to update your site

Common Issues and Solutions
Problem | Solution |
---|---|
Visualizations not appearing | Ensure transcript files are properly formatted and located in the objects/ directory |
Missing metadata fields | Check that you’ve added all required OHD fields to your metadata.csv |
Styling issues | Make sure all CSS files were correctly copied over |
Build errors | Verify _config.yml was correctly replaced and contains valid YAML |
Next Steps
After converting your repository:
- Customize your site’s appearance by editing the theme settings in _config.yml
- Add more transcripts and metadata as your collection grows
- Explore the visualization features by clicking on different transcripts
For more details on customization options and features, see our theme settings guide and visualization guide.