Adding records to Islandora: Difference between revisions



=Generating records=
* Use bib-holdings pairs to generate records. Use the script below to generate bib-holdings pairs from call numbers if needed.
* Extend the records to have the correct number of child records for each parent.
* If imaging has generated rootfiles, add them to the child records.
** Otherwise, send the records to imaging for rootfiling.
** Rename image files with rootfile names in IrfanView thumbnails.
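Extending the records to the correct number of child records can be sketched as below: one numbered child row per image under each parent. This is a hedged sketch, not the production script; the IDs and title are hypothetical, and the column names (parent_id, field_weight) follow the Islandora Workbench CSV columns used elsewhere on this page.

```python
# Hedged sketch: expand one parent row into numbered child rows, one per
# image. parent_id and field_weight match the Workbench CSV columns used
# on this page; the IDs and title here are made-up examples.
def make_children(parent_id, title, page_count):
    rows = []
    for i in range(1, page_count + 1):
        rows.append({
            "title": "{0}, image {1}".format(title, i),
            "parent_id": parent_id,
            "field_weight": i,
        })
    return rows

children = make_children("parent-1", "Sample title", 3)
```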
=Importing records=
* Upload images to S3
* Add S3 links to records
* Upload spreadsheet to Islandora
** After upload has processed, generate thumbnail for parent record
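Adding the S3 links to the records amounts to building each object's URL from its rootfile name. A minimal sketch, assuming a virtual-hosted-style S3 URL; the bucket name and key prefix are placeholders, not the real Folger values:

```python
# Hedged sketch: build the S3 object URL that goes into each record from
# the image's rootfile name. BUCKET and PREFIX are placeholders.
BUCKET = "example-bucket"
PREFIX = "digitized"

def s3_url(rootfile):
    return "https://{0}.s3.amazonaws.com/{1}/{2}".format(BUCKET, PREFIX, rootfile)

print(s3_url("0001.tif"))  # https://example-bucket.s3.amazonaws.com/digitized/0001.tif
```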
=Useful scripts=
==Script to generate Islandora records from given holdings-bib pairs==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
import requests
</pre>
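The full script is collapsed above; its overall shape can be sketched as follows. This is a hedged outline only: given holdings-bib ID pairs (see the sample dictionary later on this page), write one row per pair, reusing two of the column names from the finding-aid script below. The pair value shown is from the sample dictionary; everything else is illustrative.

```python
import csv
import io

# Hedged sketch: one output row per holdings-bib pair, using the
# field_bib_id / field_holdings_id columns from the finding-aid script.
holdingsToBibDictionary = {"158300": "164478"}

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["title", "field_bib_id", "field_holdings_id"])
writer.writeheader()
for hldgID, bibID in holdingsToBibDictionary.items():
    writer.writerow({"title": "", "field_bib_id": bibID, "field_holdings_id": hldgID})
```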


==Script to generate bib-holdings pairs from a list of call numbers==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
import requests
from lxml import etree
import csv
from searchList import searchList

headers = {'Authorization': "Token APIKeyGoesHere"}
url1 = "https://catalog.folger.edu/api/v1/search"
params2 = {"of": "xm"}

csvF = open("record.csv", "w", newline='', encoding='utf-8')
fieldnames = ["search string", "callNum", "bib id", "holdings id"]
writer = csv.DictWriter(csvF, fieldnames=fieldnames)
writer.writeheader()

for searchString in searchList:
    callNum = ""
    bibID = ""
    hldgID = ""
    params1 = {"format": "id", "f": "callnumber", "p": searchString}
    response = requests.get(url1, headers=headers, params=params1)
    jsonResponse = response.json()
    if jsonResponse["total"] == 1:
        bibID = str(jsonResponse["hits"][0])
        url2 = "https://catalog.folger.edu/api/v1/record/" + bibID
        r = requests.get(url2, headers=headers, params=params2)
        root = etree.fromstring(r.content)
        # Get holdings ID and call number--this only works for records
        # with a single holdings record (for now).
        for datafield in root.findall("datafield[@tag='852']"):
            dict852 = {}
            for subfield in datafield.findall("subfield"):
                dict852[subfield.attrib['code']] = subfield.text
            callNum = ""
            hldgID = ""
            if "7" in dict852:
                hldgID = dict852["7"]
            if "k" in dict852:
                callNum = dict852["k"]
            if "h" in dict852:
                callNum = callNum + " " + dict852["h"]
            if "i" in dict852:
                callNum = callNum + " " + dict852["i"]
            callNum = callNum.replace("  ", " ").lstrip(" ").rstrip("., ")
            print(callNum)
            print(bibID)
            print(hldgID)
        writer.writerow({"search string": searchString, "callNum": callNum,
                         "bib id": bibID, "holdings id": hldgID})
    else:
        numOfResponses = str(jsonResponse["total"])
        writer.writerow({"search string": searchString,
                         "callNum": "total record response " + numOfResponses,
                         "bib id": "", "holdings id": ""})
csvF.close()
</pre>
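The script above does `from searchList import searchList`, so it expects a sibling module holding the search strings. A minimal searchList.py might look like this; the two call numbers are illustrative only, not a real work list:

```python
# searchList.py -- the module imported by the script above: a plain list
# of call-number search strings. These two values are examples only.
searchList = [
    "STC 22353",
    "STC 22354",
]
```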
==Dictionary of relator terms==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
relatorDictionary={
</pre>


==List of Shakespeare quarto call numbers==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
ShxQuartos=[
</pre>


==Sample dictionary of holdings-bib ID pairs==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
holdingsToBibDictionary={
"158300": "164478",
"230236": "128729"
}
</pre>
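The pair dictionary is keyed by holdings ID, so looking up the bib ID for a given holdings record is a plain dictionary lookup. A small sketch using the sample values above:

```python
# Look up the bib ID paired with a holdings ID; None means the holdings
# ID is not in the dictionary. Values are from the sample pairs above.
holdingsToBibDictionary = {
    "158300": "164478",
    "230236": "128729",
}

def bib_for_holdings(hldg_id):
    return holdingsToBibDictionary.get(hldg_id)
```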
==Script to generate Islandora records from finding aid xml==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
from lxml import etree
import csv

csvF = open("islandoraRecord.csv", "w", newline='', encoding='utf-8')
fieldnames=["title","id","parent_id","field_resource_type","field_model","field_member_of","field_weight","field_identifier","field_linked_agent","field_creator","field_edtf_date","field_place_published","field_extent","field_rights","field_subject","field_note","field_classification","field_page_opening","field_contents","field_catalog_link","field_finding_aid_link","field_created_published","field_genre","field_iconclass_headings","field_bindings_features","field_bindings_terms","field_transcription","field_digital_image_type","field_microfilm_call_number","field_microfilm_reduction_ratio","field_microfilm_length","field_credit","field_sponsored_by","field_bib_id","field_holdings_id","field_display_hints","file","url_alias"]
writer=csv.DictWriter(csvF,fieldnames=fieldnames)
writer.writeheader()
filename="findingAid"
tree = etree.parse(filename+'.xml')
for elem in tree.iter():
    if not (
        isinstance(elem, etree._Comment)
        or isinstance(elem, etree._ProcessingInstruction)
    ):
        elem.tag = etree.QName(elem).localname
etree.cleanup_namespaces(tree)
   
nodeList = tree.xpath('//c[@level="item"]')
for node in nodeList:
    callNumber = ""
    accessionNumber = ""
    displayTitle = ""
    titleCreator = ""
    titleAgents = []
    titleLocationCreated = ""
    titleLocationReceived = ""
    locationCreated = {}
    agentCreator = ""
    agentRecipient = ""
    displayDate = ""
    scopecontent = ""
    bioghist = ""
    physfacet = ""
    oddp = ""
    notes = ""
    #date
    dateSearch = node.xpath('did/unitdate')
    for date in dateSearch:
        displayDate = date.text
    #identifier
    identifierSearch = node.xpath('did/unitid')
    for identifier in identifierSearch:
        callNumber = identifier.text
        print(callNumber)
    #title
    titleSearch = node.xpath('did/unittitle')
    for title in titleSearch:
        displayTitle += "".join(title.itertext())
    #notes
    abstractSearch = node.xpath('scopecontent/p')
    for abstract in abstractSearch:
        scopecontent += " ".join(abstract.itertext())
        scopecontent = scopecontent.replace("\n"," ")
    #notes
    noteSearch = node.xpath('bioghist/p')
    for note in noteSearch:
        bioghist += "".join(note.itertext())
        bioghist = bioghist.replace("\n"," ")
    generalNoteSearch = node.xpath('did/physdesc/physfacet')
    for generalNote in generalNoteSearch:
        physfacet += "".join(generalNote.itertext())
    oddNoteSearch = node.xpath('odd/p')
    for oddNote in oddNoteSearch:
        oddp += "".join(oddNote.itertext())
        oddp = oddp.replace(" \n"," ")
    notes='{0} {1} {2} {3}'.format(scopecontent,bioghist,physfacet,oddp)
    notes = notes.replace("  ", " ").strip()
    writer.writerow({"title":displayTitle,"id":"","parent_id":"","field_resource_type":"","field_model":"","field_member_of":"","field_weight":"","field_identifier":"","field_linked_agent":"","field_creator":"","field_edtf_date":"","field_place_published":"","field_extent":"","field_rights":"","field_subject":"","field_note":notes,"field_classification":callNumber,"field_page_opening":"","field_contents":"","field_catalog_link":"","field_finding_aid_link":"","field_created_published":displayDate,"field_genre":"","field_iconclass_headings":"","field_bindings_features":"","field_bindings_terms":"","field_transcription":"","field_digital_image_type":"","field_microfilm_call_number":"","field_microfilm_reduction_ratio":"","field_microfilm_length":"","field_credit":"","field_sponsored_by":"","field_bib_id":"","field_holdings_id":"","field_display_hints":"","file":"","url_alias":""})
</pre>
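The script above uses lxml to rewrite every element's tag to its local name, so the XPath `//c[@level="item"]` works without namespace prefixes. The same idea using only the standard library can be sketched as follows; the one-line EAD snippet is a made-up example:

```python
import xml.etree.ElementTree as ET

def strip_namespaces(root):
    # Rewrite '{namespace-uri}tag' to bare 'tag' on every element.
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root

ead = ('<ead xmlns="urn:isbn:1-931666-22-9">'
       '<c level="item"><did><unitid>X.d.1</unitid></did></c></ead>')
root = strip_namespaces(ET.fromstring(ead))
item = root.find('.//c[@level="item"]')
print(item.find("did/unitid").text)  # X.d.1
```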
=Adding images to S3=
=Importing records to Islandora=
=Adding links to the catalog=
https://mjordan.github.io/islandora_workbench_docs/generating_csv_files/#using-a-drupal-view-to-identify-content-to-export-as-csv
==Script to add links to the catalog using a CSV of link text and record IDs==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
import requests
response = requests.request("POST", URL, data=payload, headers=headers, params=params)
</pre>
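The script above consumes a CSV with the header shown in the next section. Reading it can be sketched as below; the row values are made-up examples, not real catalog data:

```python
import csv
import io

# Hedged sketch: parse the link CSV (bib id, 856 $u URL, 856 $z link
# text) into tuples. The data row here is a made-up example.
sample = io.StringIO(
    "bib id,856 $u,856 $z\n"
    "123456,https://digitalcollections.folger.edu/node/1,Digital image available\n"
)
links = [(row["bib id"], row["856 $u"], row["856 $z"])
         for row in csv.DictReader(sample)]
```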
==Sample CSV file with link text and record IDs==
<pre style="min-height:38px; margin-left:2em" class="mw-collapsible mw-collapsed" data-expandtext="Expand to see script">
bib id,856 $u,856 $z
</pre>

Revision as of 12:39, 26 March 2025

This page is under construction
