- The following steps require Python 3. If you do not already have it installed, you can download it from python.org.
- Navigate to the finding aids website.
- Select the finding aid in question and view its page source. Save this XML as a .txt file in your Python folder.
- Run the following Python script on the text file you just saved. Before running it, edit the file name in the first line of the script to match the name of your text file:
with open('findingaid.txt', 'r', encoding='utf-8') as f:
    lines = f.readlines()

item = ""
newItem = False
titles = []
callnos = []
dates = []

for line in lines:
    # an item starts on a line with a <c ...> tag marked level="item" and ends at </c>
    if "<c" in line and "level=\"item\"" in line:
        newItem = True
    elif "</c>" in line:
        newItem = False
    if newItem:
        item = item + line
    elif len(item) > 0:
        # get title
        start = item.find("<unittitle>")
        if start != -1:
            start += 11
            end = item.find("</unittitle>")
            title = item[start:end]
            # remove inline tags from the title
            title = title.replace("<persname>", "")
            title = title.replace("</persname>", "")
            title = title.replace("<famname>", "")
            title = title.replace("</famname>", "")
            title = title.replace("<corpname>", "")
            title = title.replace("</corpname>", "")
            title = title.replace("<p>", "")
            title = title.replace("</p>", "")
            title = title.replace("<title render=\"italic\">", "")
            title = title.replace("</title>", "")
            # collapse line breaks and repeated spaces
            title = ' '.join(title.split())
            titles.append(title)
        else:
            titles.append("null")
        # get callno
        start = item.find("<unitid>")
        if start != -1:
            start += 8
            end = item.find("</unitid>")
            callnos.append(item[start:end])
        else:
            callnos.append("null")
        # get date
        start = item.find("<unitdate>")
        if start != -1:
            start += 10
            end = item.find("</unitdate>")
            dates.append(item[start:end])
        else:
            dates.append("null")
        # reset for the next item
        item = ""

# write one value per line to each output file
with open('FAtitles.txt', 'w', encoding='utf-8') as titlefile:
    for title in titles:
        titlefile.write("%s\n" % title)
with open('FAcallnos.txt', 'w', encoding='utf-8') as callnofile:
    for callno in callnos:
        callnofile.write("%s\n" % callno)
with open('FAdates.txt', 'w', encoding='utf-8') as datefile:
    for date in dates:
        datefile.write("%s\n" % date)
- After running this script, you will find three text files in your Python folder: FAcallnos.txt, FAdates.txt, and FAtitles.txt. These files contain the extracted finding aid metadata and will be used in the following steps.
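Before pasting the three files into a spreadsheet, it may be worth confirming that they line up row for row. The script writes call numbers and dates exactly as found, so a value that spans several lines in the XML would take up several lines in its output file and shift everything below it. The following is a minimal sketch of such a check, not part of the original workflow; the helper names (`read_lines`, `line_counts`, `is_aligned`, `null_rows`) are hypothetical, and it assumes the three files sit in the current folder.

```python
from pathlib import Path

# Output file names written by the extraction script.
FILES = ("FAcallnos.txt", "FAtitles.txt", "FAdates.txt")


def read_lines(path):
    """Return the lines of a UTF-8 text file, without line endings."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def line_counts(columns):
    """Map each name to the number of lines in its list."""
    return {name: len(values) for name, values in columns.items()}


def is_aligned(columns):
    """True when every list has the same number of lines."""
    return len(set(line_counts(columns).values())) <= 1


def null_rows(values):
    """Return the 1-based line numbers that hold the placeholder "null"."""
    return [n for n, value in enumerate(values, start=1) if value == "null"]


# Only runs when the three files are present in the current folder.
if __name__ == "__main__" and all(Path(name).exists() for name in FILES):
    columns = {name: read_lines(name) for name in FILES}
    print(line_counts(columns))
    if is_aligned(columns):
        print("All three files have the same number of lines.")
    else:
        print("Line counts differ; look for values that span several lines.")
    for name, values in columns.items():
        print(name, "has null on lines:", null_rows(values))
```

If the counts differ, the affected file can be corrected by hand before it is pasted into the spreadsheet.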
Create a master spreadsheet of metadata from the finding aid
- Open a new spreadsheet, and in row 1, enter column headings as follows:
- column A: ItemPages
- column B: ItemSubTitle
- column C: ItemDate
- column D: ItemTitle
- column E: CallNumber
- Under the column headings, complete the spreadsheet as follows:
- column A: paste in the contents of FAcallnos.txt
- column B: paste in the contents of FAtitles.txt
- column C: paste in the contents of FAdates.txt
- column D: the title of the finding aid; fill this down through the last row that has data in columns A, B, and C.
- column E: the call number range of the finding aid, e.g., Folger.MS.L.c.1-3950; fill this down through the last row that has data in columns A, B, and C.
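As an alternative to pasting by hand, the column layout above could be assembled with Python's built-in `csv` module and then opened in a spreadsheet program. This is only a sketch under stated assumptions: the output name `master.csv`, the helper names, and the placeholder finding aid title are hypothetical, and the call number range is the example value given above.

```python
import csv
from pathlib import Path

# Column headings from row 1 of the master spreadsheet.
HEADINGS = ["ItemPages", "ItemSubTitle", "ItemDate", "ItemTitle", "CallNumber"]


def read_lines(path):
    """Return the lines of a UTF-8 text file, without line endings."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def build_rows(callnos, titles, dates, finding_aid_title, callno_range):
    """Return the heading row followed by one row per extracted item.

    Columns A-C come from the three text files; columns D and E repeat
    the same finding aid title and call number range on every row.
    zip() stops at the shortest list, so the lists should be equal length.
    """
    rows = [list(HEADINGS)]
    for callno, title, date in zip(callnos, titles, dates):
        rows.append([callno, title, date, finding_aid_title, callno_range])
    return rows


def write_csv(rows, path):
    """Write rows to a UTF-8 CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as out:
        csv.writer(out).writerows(rows)


# Only runs when the three extracted files are present in the current folder.
if __name__ == "__main__" and all(
    Path(n).exists() for n in ("FAcallnos.txt", "FAtitles.txt", "FAdates.txt")
):
    rows = build_rows(
        read_lines("FAcallnos.txt"),
        read_lines("FAtitles.txt"),
        read_lines("FAdates.txt"),
        "Title of the finding aid",   # placeholder: replace with the real title
        "Folger.MS.L.c.1-3950",       # example range from the instructions
    )
    write_csv(rows, "master.csv")     # hypothetical output file name
```

The `csv` module quotes any title that contains a comma, so such titles stay in a single cell when the file is opened.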
Create and use Aeon batches
- Copy the rows corresponding to the desired items (e.g., all of the items in a box) into a new spreadsheet, retaining the column headings. Save.
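If the master spreadsheet has been saved as a CSV file, the copy step could also be scripted. The sketch below keeps the heading row and only those rows whose column A value appears in a list of wanted values. It is an illustration rather than the documented procedure: the file names `master.csv` and `batch.csv`, the function `select_batch`, and the sample values in `wanted` are all hypothetical, and deciding which items belong to a box is still up to you.

```python
import csv
from pathlib import Path


def select_batch(rows, wanted_values):
    """Return the heading row plus the rows whose first column is wanted.

    rows: list of rows (lists of strings), with the column headings first.
    wanted_values: the column A values of the items to include in the batch.
    """
    wanted = set(wanted_values)
    header, body = rows[0], rows[1:]
    return [header] + [row for row in body if row and row[0] in wanted]


# Only runs when a master CSV (hypothetical name) is present.
if __name__ == "__main__" and Path("master.csv").exists():
    with open("master.csv", newline="", encoding="utf-8") as src:
        master = list(csv.reader(src))

    # Placeholder values: replace with the column A values of the desired items.
    wanted = ["value for item 1", "value for item 2"]

    with open("batch.csv", "w", newline="", encoding="utf-8") as out:
        csv.writer(out).writerows(select_batch(master, wanted))
```

Rows are kept in their original order, and the column headings are always retained, as the step above requires.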