What is Metadata? Why is Metadata Important?

Metadata is often described as “data about data,” but that definition is not super helpful! Instead, think of metadata as a consistent way to describe and provide context for a particular item. In the case of community archiving, these items would be photographs of artifacts and recorded oral histories or interviews with community members. At a minimum, when this information is known, metadata records information about the item (e.g., a description and the subjects covered), when the item was created, and who created it.

Meaningful metadata relies on a combination of “natural language” descriptions that reflect community-focused language and “controlled vocabularies,” which are structured sets of terms maintained by large organizations to make search terms standard across many web sites.

The best source of metadata about a community contribution to your archive is the person making that contribution.

How do I create metadata with my community?

Community elements of metadata are important for describing things in “natural language” – colloquialisms, code-switching, cultural specifics, and more. Sometimes, “natural language” terms align with “controlled vocabularies,” which are library-focused tools like thesauri (Getty Art & Architecture Thesaurus, GeoNames, etc.) or authority lists like Library of Congress Subject Headings or Name Authority files that allow us to standardize how we describe items. When possible, include both “natural language” and controlled language in your metadata creation practices.

Community conversations about object descriptions

Rather than rejecting institutional frameworks entirely, community-generated metadata offers another approach that rebalances authority, values lived experience, and creates a more liberatory, participatory archival process. This approach can be used alongside conventional standards when appropriate.

Here’s a recommended community-centered process for collecting metadata collaboratively. Note that these prompts are also included in the oral-history section of the DigitalArc toolkit, for use in planning the oral history interviews.

  1. Start with shared values and relationship building
    1. Build trust, explain what metadata is in plain, accessible language, discuss why it matters, and co-define goals for describing materials
  2. Develop a collaborative metadata form using the suggested key fields and prompts:
    1. Title
      1. Prompt: “What would you call this item?”
      2. Intention: Community defined naming
    2. Description
      1. Prompt: “Tell us the story behind this.”
      2. Intention: Centering personal meaning
    3. Date
      1. Prompt: “When was this from? Approximate is okay.”
      2. Intention: Valuing memory over precision
    4. People
      1. Prompt: “Who is in this photo/story?”
      2. Intention: Naming and honoring individuals
    5. Place
      1. Prompt: “Where did this take place?”
      2. Intention: Grounding materials in place
    6. Emotions/Significance
      1. Prompt: “How does this item make you feel?”
      2. Intention: Centering affect and cultural value
    7. Permissions
      1. Prompt: “How should this item be used/shared?”
      2. Intention: Honoring agency and consent
    8. Tags/Keywords
      1. Prompt: “What words would help people find this?”
      2. Intention: Enabling community-generated vocabularies
  3. Suggestions for Facilitation
    1. Get in pairs
    2. Listen deeply, write down stories, ask clarifying questions
    3. Treat storytelling as metadata
    4. Record audio or video if that captures richer metadata than text alone

Other Sources for Metadata

Sign-In Form

The best source of metadata about a photograph of an artifact or an oral history is the person contributing those objects and stories to your online community archive. If you used the sign-in form as part of a community sharing-collecting event, you will have collected some metadata from the community contributor that you can bring over to the metadata spreadsheet:

  • Title
  • Description
  • Significance

Interview/Oral History

During the oral history in which community contributors may discuss the artifact photographed for the digital archive, you will gather other important bits of information:

  • Date of the artifact
  • Creator of the artifact

If the interview/oral history is not focused on an artifact, but rather focused on storytelling, the stories shared will provide information that can make up the metadata for the interview/oral history:

  • Topics covered (“tags” or controlled subject headings)
  • Information about the interviewee (biographical, historical, etc.)

Follow-Up Research

If you have time, you may conduct additional research through reference services available at your local public library or history center. Your local college or university may also hold related archives.

What Apps do I need to create and manage metadata?

  1. Google Sheets
  2. Text Editor (note: must be installed)

You can use other software products like Microsoft Excel, but we know that accessing those products may require subscriptions (at a cost). You can also use plain text editors that come free with your computer, like Notepad for Windows or TextEdit for Mac.

We will be working within the Google ecosystem since Google’s free account often comes with sufficient storage for starting a community archiving project.

Installing Text Editor App in Google Drive

If you have a community Google account set up, make sure you are logged into that account before you install the Text Editor app. More detailed instructions are provided by Google, but here are the basic steps:

  1. Check to make sure you are logged into the proper Google account
  2. Visit: workspace.google.com/marketplace
  3. Search for “text editor.” This will return a number of results. We recommend installing:
    1. Text Editor (free but with advertisements, large user pool)

Keeping Track of your Metadata

By far, the easiest way to keep track of your metadata before setting up your site is to use spreadsheet software such as Excel or Google Sheets, depending on your specific data collection procedures.

DigitalArc Spreadsheet Template

Included is a spreadsheet template for collecting the metadata for each item. With a few minor steps, you can copy and paste that metadata into a Word document or Google document as you prepare the item records for publishing. Once you are comfortable with creating item records in GitHub, you can create these files directly in GitHub (using Markdown).

The template contains two tabs: “Item Descriptions” and “Instructions.”

Item Descriptions Tab

  1. Column A
    1. Contains the labels required by the DigitalArc web application/platform. DO NOT change or edit these labels.
  2. Columns B+
    1. Each column should represent a distinct item.
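
The layout described above (labels in Column A, one item per column after that) can be turned into one record per item. Here is a minimal Python sketch, assuming the sheet has been downloaded as a CSV file. The sample text and the function name items_from_columns are made up for illustration and are not part of the DigitalArc template.

```python
import csv
import io


def items_from_columns(csv_text):
    """Return one dict per item column, keyed by the labels in Column A."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return []
    labels = [row[0].strip() for row in rows]
    item_count = max(len(row) for row in rows) - 1
    items = []
    for col in range(1, item_count + 1):
        # Pad short rows so a blank cell becomes an empty string.
        values = [row[col].strip() if col < len(row) else "" for row in rows]
        items.append(dict(zip(labels, values)))
    return items


# Hypothetical export: Column A holds labels, Columns B and C hold two items.
sample = (
    "title,Family Bible,Church Picnic\n"
    "format,photo,photo\n"
    "creationdate,circa 1920,1980s\n"
)

for item in items_from_columns(sample):
    print(item)
```

Reading the sheet column by column keeps the labels in Column A untouched, which matters because the platform depends on them.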

Instructions Tab

Essential instructions are included here for quick reference with a link back to this document, if additional details are needed.

Categorizing Metadata

As you work on gathering and compiling the metadata for each item contributed to your online community archive, DigitalArc provides two important ways to make your items more discoverable: Tags and Categories.

Tags

Tags can be more free-form, assigned as metadata records are created per item. Tags are natural outcomes of community-based metadata creation; they are easier to define and assign, primarily because they reflect community-specific language. While it is recommended to maintain a project-specific list of tags for re-use, it is also important that an item’s tags capture the concepts or themes the item represents. For example, if a contributor submits a postcard, tags may include: postcard, travels, Chicago (origin or destination of the postcard), Barack Obama Presidential Library (key topics or places mentioned in the postcard). Aim for 5-10 tags per item.

Tags are searchable, but they are not currently used to filter results like Categories and Type (see below).

Categories

As you add items to the metadata spreadsheet, you’ll have a cross-cutting view of the items contributed by community partners. The next step is to determine overarching themes or categories. You might already have a sense of this before compiling metadata, or you may need to compile all the metadata before formulating categories. Often, analyzing the tags by grouping similar tags will point to a core set of categories for browsing your community digital archive.

The goal is to create a limited number of overarching categories (<20) to facilitate browsing across your digital archive. Categories can be based on time like decades (1900s, 1910s, etc.) or topics (Family, Religion, etc.) or types of content (Photographs, Oral Histories, Recipes, etc.). They can also be a mixture of these.
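
One way to start grouping similar tags is to count how often each tag appears across items; tags that come up again and again are candidates for categories. The sketch below is only an illustration of that idea. The tag lists and the function name tag_counts are hypothetical, and deciding which frequent tags become categories is still a conversation for your community.

```python
from collections import Counter


def tag_counts(items):
    """Count how many items use each tag, ignoring case and extra spaces."""
    counts = Counter()
    for tags in items:
        # A set keeps a tag repeated on one item from being counted twice.
        counts.update({tag.strip().lower() for tag in tags if tag.strip()})
    return counts


# Hypothetical tag lists, one per item in the spreadsheet.
example_items = [
    ["postcard", "travels", "Chicago"],
    ["Postcard", "family", "Chicago"],
    ["recipe", "family"],
]

for tag, count in tag_counts(example_items).most_common():
    print(f"{count}  {tag}")
```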

The metadata for a specific item will, naturally, be particular to that item. This will include unique, identifying details, and may include information related to the individual who submitted it.

If a participant has opted not to have their name included, be sure that you do NOT include any personal data about the individual(s) who submitted the item.

Preparing Your Metadata Before Making Your Website

While you don’t need to start building your site yet, it helps to understand how to format the information above into a structure that will make building your site easier. We’ll cover how to use these metadata fields to actually build a web site in the section on Posting Items when you’re ready to Publish Your Site.

Special Characters in Metadata

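The text you type for a metadata value, such as a title or description, cannot include double quotes (“ ”) or colons (:). If quotes are absolutely essential, use single quotes (‘ ’) instead. This rule does not apply to the colon that follows each field name or to the double quotes that wrap a whole value in the examples below.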
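
A quick check like the following can flag these characters before you paste descriptive text into an item record. It is a sketch, not part of DigitalArc: the function names find_forbidden and soften_quotes are invented here, and the check is meant for descriptive text such as titles and descriptions, not for links. Colons are only reported, since rewording them is a judgment call.

```python
def find_forbidden(value):
    """Return the forbidden characters found in one metadata value."""
    # Straight and curly double quotes, plus the colon.
    forbidden = ['"', "\u201c", "\u201d", ":"]
    return [ch for ch in forbidden if ch in value]


def soften_quotes(value):
    """Swap straight or curly double quotes for single quotes."""
    for ch in ('"', "\u201c", "\u201d"):
        value = value.replace(ch, "'")
    return value


# Hypothetical description containing both kinds of problem character.
text = 'Grandma\u2019s "famous" pie: 1962'
print(find_forbidden(text))
print(soften_quotes(text))
```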

Metadata Line-by-Line

The DigitalArc web site is able to process the following metadata fields. The bolded fields are required, even if they are blank or have values like “Unknown”:

layout: item – This never changes

format: photo – This controls the list of checkboxes that show on the right-hand side of your main collection page. While you can use other formats, these are the 4 item formats that you’re likely to encounter:

  • photo (includes photos of objects or people)
  • document (includes scans of documents like memos, letters, or newspaper articles)
  • media (includes embedded audio or video created by members of the community)
  • outsidelink (includes social media posts, newspaper articles that need to be linked outside of the community-archive site)

title: “ACLS Digital Justice Development Grant” – The title of the item

author: “Michelle Dalmau, Kalani Craig, Vanessa Elias, Jazma Sutton” – This can be blank. It’s used only if the item is a written/audio/video contribution and the author or creator is different from the person who contributed the item.

contributor: “Michelle Dalmau, Kalani Craig, Vanessa Elias, Jazma Sutton” – The person or people who contributed the item to the collection.

group: “DigitalArc Grant Startup” – If you’re sorting your archive into family or content groups, this metadata field helps make that easier. This feature hasn’t been implemented yet. Coming soon! For now, you can use Category entries to make family groups easier to see.

creator: “DigitalArc Platform Team” – The creator of the item (a brand, a person, a collective). If you want to list both authors and a publisher or a collective name for a group of authors, the creator metadata field makes that easier.

externalurl: https://www.acls.org/recent-fellows/?program_id=40090&_project_year=2024 – If the item originated as an outside link or has a social-media link that you want to link to, put that link here.

embedurl: – If the item originated as an outside link or has a social-media link that you want to embed in the item page, put that link here.

creationdate: “May 22, 2024” – When was the item created? Dates can be exact, as they are in this example, or approximate, like “circa 1985” or “1980s.”

type: Website – The information listed in “type,” if used on the web site, will serve as a filter based on item types when searching or browsing your digital archive. Type could be format-specific like: image, audio, video or genre-specific like: photographs, journals/diaries, recipes, interviews, etc. You can reference controlled lists from the Library of Congress or Dublin Core Type vocabulary. You can also create a community-specific list of types that you tailor for your content and manage for consistency.

If the community is using full metadata, this item type should be based on Library of Congress item types, which are more specific than the format metadata field above.

shortdesc: “This is an example of how to include a document (scanned or screen-captured). The development of the DigitalArc Toolkit was funded by an ACLS Digital Social Justice grant in 2024.” – This is the short description about the item that shows up on the all-items list page.

tags: [Community, Archives, Toolkit, Project Planning, Publishing] – This should contain a list of free-form tags, up to 20, that describe essential information about the item. They can be names of people or places, topics (more specific than categories), dates, events, purpose of item, format, etc. Enter tags in square brackets [ ] with each tag separated by a comma.

categories: [ News Articles, Web Pages ] – This metadata field controls the filter buttons that show up on the main collection page. List the categories your community is using to group items, and keep the list to under 20, if possible. Enter categories in square brackets [ ] with each category separated by a comma.

teammember: – If a team-member did considerable work prepping an item or helping a contributor, they can be credited here.

Example of Metadata Layout for Publishing

Metadata starts and ends with a line that has three dashes and nothing else (the dash is the key next to the “zero” key on your keyboard). If something goes wrong with an item, the first thing to check is that there’s no space after the three dashes.

Each item on your site will need the following information at minimum:

---
layout: item
format: document
title: "ACLS Digital Justice Development Grant"
author: "DigitalArc Platform Team"
contributor: "DigitalArc Platform Team"
creator: "DigitalArc Platform Team"
creationdate: 2024-05-22
type: "website"
shortdesc: "This is an example of how to include a document (scanned or screencaptured). The development of the DigitalArc Toolkit was funded by an ACLS Digital Social Justice grant in 2024."
categories: [ News Articles, Web Pages ]
---
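
Before posting an item, the metadata block can be checked for the two problems described above: dash lines with stray spaces and missing fields. The sketch below is a hypothetical helper, not a DigitalArc tool. Its list of required fields is an assumption taken from the minimum example in this guide, and it expects only the metadata block itself, from the first line of dashes to the last.

```python
# Assumption: these are the fields shown in this guide's minimum example.
REQUIRED = [
    "layout", "format", "title", "author", "contributor",
    "creator", "creationdate", "type", "shortdesc", "categories",
]


def check_item(text):
    """Return a list of problems found in one item's metadata block."""
    lines = text.split("\n")
    while lines and lines[-1] == "":
        lines.pop()  # ignore blank lines at the end of the text
    problems = []
    if len(lines) < 2 or lines[0] != "---" or lines[-1] != "---":
        problems.append("first and last lines must be exactly ---")
    # Field names are whatever comes before the first colon on a line.
    names = {line.split(":", 1)[0].strip() for line in lines if ":" in line}
    problems += [f"missing field: {name}" for name in REQUIRED if name not in names]
    return problems


sample = """---
layout: item
format: document
title: "ACLS Digital Justice Development Grant"
author: "DigitalArc Platform Team"
contributor: "DigitalArc Platform Team"
creator: "DigitalArc Platform Team"
creationdate: 2024-05-22
type: "website"
shortdesc: "Example document."
categories: [ News Articles, Web Pages ]
---
"""

print(check_item(sample))
# The same block with a space after the opening dashes is reported.
print(check_item(sample.replace("---\n", "--- \n", 1)))
```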

Some items may have more metadata fields, or blank information, like so:

---
layout: item
format: document
title: "ACLS Digital Justice Development Grant"
author: "DigitalArc Platform Team"
contributor: "DigitalArc Platform Team"
group: "DigitalArc Grant Startup"
creator: "DigitalArc Platform Team"
externalurl: https://www.acls.org/recent-fellows/?program_id=40090&_project_year=2024
embedurl:
creationdate: "May 22, 2024"
type: "website"
shortdesc: "This is an example of how to include a document (scanned or screencaptured). The development of the DigitalArc Toolkit was funded by an ACLS Digital Social Justice grant in 2024."
categories: [ News Articles, Web Pages ]
teammember:
---
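
If you are preparing many items, the layout above can also be generated from a record rather than typed by hand. The following Python sketch shows one possible approach under stated assumptions: the function name render_item is invented, the choice of which fields get double quotes simply mirrors the examples in this guide, and the sample record is deliberately partial.

```python
# Assumption: these groupings mirror how the examples in this guide are written.
LIST_FIELDS = {"tags", "categories"}
UNQUOTED_FIELDS = {"layout", "format", "externalurl", "embedurl"}


def render_item(fields):
    """Build the metadata block for one item from a dict of field values."""
    lines = ["---"]
    for name, value in fields.items():
        if name in LIST_FIELDS:
            # Lists go in square brackets, separated by commas.
            text = "[ " + ", ".join(value) + " ]"
        elif value == "" or name in UNQUOTED_FIELDS:
            text = value
        else:
            text = f'"{value}"'
        # rstrip avoids a trailing space when a field is left blank.
        lines.append(f"{name}: {text}".rstrip())
    lines.append("---")
    return "\n".join(lines) + "\n"


# Hypothetical partial record; a real item needs all of the minimum fields.
record = {
    "layout": "item",
    "format": "document",
    "title": "ACLS Digital Justice Development Grant",
    "embedurl": "",
    "creationdate": "May 22, 2024",
    "categories": ["News Articles", "Web Pages"],
    "teammember": "",
}

print(render_item(record))
```

Generating the dash lines in code avoids the stray trailing space that is the first thing to check when an item fails to display.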

Prepping Metadata for Publishing

Step 1: Compiling the Spreadsheet

  1. Copy the metadata spreadsheet template to your community Google Drive
  2. Add metadata to your spreadsheet
  3. Determine metadata categories and update them on the metadata spreadsheet

Step 2: From Spreadsheet to Plain Text Document

  1. If you haven’t already done so, install the Text Editor app in your community’s Google Drive
  2. Create plain text metadata documents per item following the example layout above

Step 3: Create an Item Text File in GitHub

  1. See Posting Items documentation