MARC records from vendors

Revision as of 18:51, 17 June 2022

MARC records supplied by vendors require editing before they can be batch-loaded into the catalog. This page describes edits that need to be made to all vendor-supplied records.

Printed resources

Fields to add

  • 500__ $a This record was provided by a vendor. It may contain incorrect or incomplete information. $5 DFo
  • 980__ $a BIB
  • 983__ $a Not-vault

Online facsimiles

Delete:

  • 506 Restrictions on access (Folger reserves this for physical restrictions; license restrictions are implicit in 852$h)
  • 538 ("Mode of access: Internet" and "Mode of access: WWW" are no longer necessary)

Replace, if necessary:

  • LDR/06: do not use "m" for digital or digitized documents (electronic aspects are coded in 006 and 007 instead)
  • LDR/06: for digitized manuscripts, use codes t, d, and f (Folger practice is contrary to OCLC, which considers digitized manuscripts to be "published" resources)
  • 008/23: use "o" ("online") not "s" ("electronic")
  • 006 for digital and digitized documents: m||||o|||d||||||||
  • 007 for digital scans of documents: cr\un|---uuuua (in theory, "color" and "antecedent/source" could be coded something other than "unknown" but in practice, we want the records to remain valid if a vendor replaces page images scanned from microfilm with page images scanned from the original)
  • 007 for native ebooks: cr\un|---uuuun
  • Link text in 852 subfields other than $u, if it doesn't make sense for the Folger catalog
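
The fixed-field replacements above can be sanity-checked with a small script. The following is a minimal sketch, not the MarcEdit task itself: the function names are hypothetical, the control fields are assumed to be plain strings in mnemonic form (backslash for blank), and 008/23 is assumed to be "form of item", which holds for book-type and serial-type 008s but not for maps or visual materials.

```python
# Target values quoted from the list above.
FIELD_006 = "m||||o|||d||||||||"
FIELD_007_SCAN = r"cr\un|---uuuua"
FIELD_007_EBOOK = r"cr\un|---uuuun"

# Expected lengths: 006 is 18 characters, 007 for electronic
# resources is 14, and 008 is always 40.
EXPECTED_LENGTH = {"006": 18, "007": 14, "008": 40}


def has_expected_length(tag: str, value: str) -> bool:
    """Report whether a control field string has the expected length."""
    return len(value) == EXPECTED_LENGTH[tag]


def set_form_online(field_008: str) -> str:
    """Return the 008 with position 23 changed from 's' to 'o'."""
    if not has_expected_length("008", field_008):
        raise ValueError(f"008 should be 40 characters, got {len(field_008)}")
    if field_008[23] == "s":
        return field_008[:23] + "o" + field_008[24:]
    return field_008
```

A length check of this kind is one way to catch vendor 007 strings that have too many characters before they are overwritten.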

Move:

  • Vendor 001 and 003 move to 035, in this format and order (unless a different match point is required): (003)001
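
As an illustration of the (003)001 format, here is a minimal sketch. The function name and the sample values are hypothetical, and the real move is done with a MarcEdit task rather than a script.

```python
def vendor_035(field_001: str, field_003: str) -> str:
    """Build the 035 $a value from vendor 001 and 003 as (003)001."""
    return f"({field_003.strip()}){field_001.strip()}"


# Hypothetical vendor values, for illustration only.
print(vendor_035("12345", "EXAMPLE"))  # prints (EXAMPLE)12345
```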

Add:

 245%% $$h[electronic resource]
 336__ $$atext$$btxt$$2rdacontent
 337__ $$acomputer$$bc$$2rdamedia
 338__ $$aonline resource$$bcr$$2rdacarrier
 500__ $$aThis record was provided by a vendor. It may contain incorrect or incomplete information.$$5 DFo
 5831_ $$abatch edited$$cyyyy-mm-dd$$kEB$$xMarcEdit Task$$2local$$5DFo
 5831_ $$abatch loaded$$cyyyy-mm-dd$$kEB$$xfrom edited vendor records$$2local$$5DFo
 852__ $$aUS-DFo$$bEleRes$$hAvailable offsite via https://request.folger.edu
 980__ $$aBIB
 983__ $$aOnline
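
The yyyy-mm-dd placeholder in the two 583 fields has to be filled in for each batch. A minimal sketch of generating both lines follows; the function name is hypothetical, the field text is copied from the list above, and the dates in the usage note are only examples.

```python
from datetime import date


def action_notes(edited: date, loaded: date) -> list[str]:
    """Return the two 583 lines with yyyy-mm-dd filled in."""
    return [
        f"5831_ $$abatch edited$$c{edited.isoformat()}"
        "$$kEB$$xMarcEdit Task$$2local$$5DFo",
        f"5831_ $$abatch loaded$$c{loaded.isoformat()}"
        "$$kEB$$xfrom edited vendor records$$2local$$5DFo",
    ]
```

For example, `action_notes(date(2022, 6, 17), date(2022, 6, 18))` returns one "batch edited" line dated 2022-06-17 and one "batch loaded" line dated 2022-06-18.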

Don't worry about:

  • 040$d (because editing by Folger staff is automatically recorded in TIND record history and manually recorded in MARC 583)

List of resources with vendor-supplied record sets

ACLS Humanities E-Book collection

  • 5,472 records (plus 2 duplicates)
  • Remove duplicates from original .mrc file based on 035
  • Run MARCValidator on deduped source file to make sure nothing more needs to be added to the MarcEdit tasks
    • Don't worry about the repeated 010 fields
  • Delete 776$q
  • Replace =830  \\$aACLS with =830  \0$aACLS
  • Change =830  \0$aATLA Special Series.  0$aACLS Humanities E-Book. to have a separate 830 for $aACLS Humanities E-Book
  • Change =830  \0$aATLA Special Series.  0$aAmerican philosophy series ;$vno. 7. to have a separate 830 for American philosophy series
  • Delete MARC 506
  • Delete MARC 538
  • Delete MARC 773
  • Replace LDR/07 d or a with m
  • Change 008/23 (form of item) from "s - electronic" to the more specific "o - online"
  • Replace 006 with m||||o|||d||||||||
  • Replace 007 with cr\un|---uuuun (most are okay as-is, but about two dozen have too many characters in the string)
  • Delete 856$z
  • Add RDA 33X fields:
    • 336__ $a text $b txt $2 rdacontent
    • 337__ $a computer $b c $2 rdamedia
    • 338__ $a online resource $b cr $2 rdacarrier
  • Add standard HBCN, 583, 980, 983
  • Add 852__ $$aUS-DFo$$bEleRes$$hAvailable onsite only, but use the list of Open Access titles to batch change the $h to "Freely available" after records are loaded.
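
The deduplication step on 035 can be sketched outside MarcEdit as follows. This is an assumption-laden illustration rather than the actual workflow: it works on the mnemonic (.mrk) text rather than the binary .mrc file, assumes records are separated by blank lines, compares only the first 035 $a in each record, and the function name is hypothetical.

```python
import re


def dedupe_by_035(mrk_text: str) -> tuple[list[str], list[str]]:
    """Split .mrk text into (kept, dropped) records by first 035 $a."""
    records = [r.strip() for r in re.split(r"\n\s*\n", mrk_text) if r.strip()]
    seen = set()
    kept, dropped = [], []
    for record in records:
        # Mnemonic lines look like: =035  \\$a(XYZ)123
        match = re.search(r"^=035  ..\$a([^$\n]+)", record, flags=re.MULTILINE)
        key = match.group(1).strip() if match else None
        if key is not None and key in seen:
            dropped.append(record)  # later copy of a record already seen
            continue
        if key is not None:
            seen.add(key)
        kept.append(record)  # records without an 035 are always kept
    return kept, dropped
```

Keeping the dropped records in a separate list makes it easy to confirm that the count matches the expected number of duplicates before anything is discarded.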

Oxford Scholarly Editions Online (OSEO)

  • 272 records
  • Remove individual records for each volume of Martin Wiggins and Catherine Richardson, British Drama 1533–1642: A Catalogue in favor of one record for the entire set (rec id 544900, https://catalog.folger.edu/record/544900?ln=en): subject headings are identical, and tables of contents are very incomplete (e.g. v. 1 covers 1533-1566 with entries for 440 plays, but its table of contents has entries for only 37 plays and stops at the start of 1537). Note: punctuation in 245$a varies, so search by field "F#:245$c Martin Wiggins and Catherine Richardson (eds)" to find all nine in MarcEdit.
  • Note: MARC 008 for both volumes of The correspondence of Sir Philip Sidney is incorrect. Vendor record has 130909s2012\\\\enka\\\fo|\\\o0|0\0\eng|d but should have 130909s2012\\\\enka\\\fob\\\\001\0\eng|d.

Adam Matthew databases

From https://www.amdigital.co.uk/support/marc-records, login not required

  • Early Modern England: 772 records (none uploaded)
    • LDR always coded a, p, or r, never "t", so manuscripts can only be determined from 245$k such as "Manuscripts, essays, memoirs and music" or "Commonplace books; Essays" (which the database uses to generate the "Document type" field)
    • All have encoded date nuuu and imprint "Marlborough, Wiltshire : Adam Matthew Digital" (except for seven that also have parts of the original imprint in the 264).
    • 6XX fields are all non-standard (except for 653, which is just keywords)
  • Literary Manuscripts Leeds: 190 records (none uploaded)
    • LDR is a mix of "a" and "t" (though originals are all manuscripts)
    • Fixed field date always coded for the database itself
    • Imprint is always for the database itself
    • 300 is always for the original manuscript
    • Most 650s should be 655s (and don't have a $v)
  • Literary Print Culture: 1,908 records (none uploaded)
    • LDR is always "a"
    • Fixed field date always coded for the database itself
    • Imprint is for the database itself
    • 300 is always "1 online resource"
    • Nothing to indicate they're manuscripts (245$k is a category, e.g. =245  00$aBond following illegal publication of almanacs :$kLegal record; Financial record$g1723.)
    • Keywords in 653 provide the only 6XX access
  • Perdita Manuscripts, 1500-1700: 216 records (none uploaded)
    • LDR varies: a, p, t (but all are manuscripts)
    • 008 and imprint are for the online resource
    • 300 is "1 online resource"
    • 6XX are non-standard (plus 653 with keywords)
  • Shakespeare in performance: 1,138 records (none uploaded)
    • LDR always "a" (material includes manuscripts, graphic materials, published texts)
    • 008, 264, and 300 are all for the electronic resource
    • 245$k provides the general category (e.g. "Photograph", "Costume design", "Autograph; Manuscript; Letters")
    • 534 correctly has "$p Reproduction of:" but dates in $c are inconsistent, and ISBD punctuation is lacking
    • Keywords in 653 provide the only 1XX, 6XX, or 7XX access
  • Shakespeare's Globe Archive: 1,759 records for documents, 26 records for non-musical recordings (none uploaded)
    • LDR/06 is "i" for sound files (oral history interviews) and "a" for everything else (incl. props, graphic materials, manuscripts)
    • 008, 264, and 300 are for the online resource
    • 100 has only $a and $e
    • 534 correctly has "$p Reproduction of:" but dates in $c are inconsistent, and ISBD punctuation is lacking
    • The only 7XX is a constant "710 2\$aAdam Matthew Digital (Firm),$edigitiser."
  • Virginia Company Archives: 2,899 records (none uploaded)
    • LDR/06 is always "t"
    • 008, 264, and 300 are all for the online resource
    • In all but about 20 records, the 245$a is the call number, 245$k is "Manuscripts" (even when item is a printed picture).
    • 520 is used for the item title
    • 6XX is non-standard

Gale databases

  • (by request, except for British Literary Manuscripts)
  • British Literary Manuscripts
    • From https://support.gale.com/marc/ (login not required)
    • 17 records for source microfilm collection titles (leads to description and targeted search)
    • 7 records for targeted searches, one for each of the seven parts of the Medieval literary and historical manuscripts in the Cotton Collection, British Library, London (links break after the tilde when followed in TIND ILS, but can be copied and pasted from the MARC)
    • In addition to Online facsimiles changes described above:
      • Delete 538
      • Replace =[LOCATIONID] with =wash46354
      • Replace existing 260 (which is for the microfilm) with 260 for Gale Cengage Learning, coded as current/latest publisher (that is, 260 3_ $a [Farmington Hills, Michigan] : $b Gale Cengage Learning, $c [2009?]). Note: Gale Cengage Learning has since been rebranded as "Gale, a Cengage Company", but BLM hasn't been updated (yet).
      • In 008: change Date 1 to 2009, change 008/15-17 to miu, and change 008/23 from "s" to "o"
      • 006 and 007 are okay as-is
      • Replace any existing 300 with 300 $a 1 online resource; add same to records that lack a 300
  • Burney Collection Newspapers
    • 1,057 records
    • Must transform from MARC8 to UTF8
    • Already has the wash46354 suffix in URL
    • Remember that many 008s are coded for serials, not books
    • Delete the one instance of ";$c?°." then run MarcEdit Degree sign format fix
    • Remove duplicate 035s from source file (e.g. 35 records with =035\\$ageneral)
    • Change 006 to m||||o|||d||||||||
    • Change 007 to cr\un|---uuuua
    • Change 008/23 from "s" to "o"
    • Add 33X
    • Delete any existing 533$n
    • Update 520s for STC, Wing, and ESTC?
    • Move 590$a to 533$n
    • Delete 648
    • Change 985 to 509
  • Eighteenth century collections online
    • 184,371 records in three separate files
  • Nichols Newspapers Collection
    • 660 records
  • British Theatre, Music, and Literature: High and Popular Culture (from Nineteenth Century Collections Online)
    • 608 records
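
The three 008 changes listed for British Literary Manuscripts all replace characters at fixed positions. A minimal sketch, assuming a 40-character book-type 008 held as a plain string (Date 1 at positions 07-10, place at 15-17, form of item at 23); the function name is hypothetical and this is not the MarcEdit task itself.

```python
def update_blm_008(field_008: str) -> str:
    """Apply the 008 changes listed for British Literary Manuscripts."""
    if len(field_008) != 40:
        raise ValueError(f"008 should be 40 characters, got {len(field_008)}")
    chars = list(field_008)
    chars[7:11] = "2009"   # Date 1
    chars[15:18] = "miu"   # place of publication: Michigan
    if chars[23] == "s":   # form of item: electronic -> online
        chars[23] = "o"
    return "".join(chars)
```

Working on a list of characters keeps every other position untouched, so the result is still exactly 40 characters long.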

ProQuest databases

  • Early English books online (EEBO): 133,109 records, with updates annually in November

Standard procedure

  1. Break file from .mrc to .mrk and open in MarcEditor
  2. Check file against list of standard changes (above) to make sure nothing has changed
  3. Tools > Manage tasks > [Dataset name]: update yyyy-mm-dd in 583s
  4. Tools > Assigned tasks > [Dataset name]
  5. Run any other relevant MarcEdit tasks (e.g. replacing degree sign in ESTC formats)
  6. Save as MARC21 XML
  7. Open in Notepad++
  8. Run Macroexpress macro to replace LDR with 000 and control field spaces with backslashes
  9. If necessary, split file into sets of about 1,000 records each and schedule uploads half an hour apart
  10. Batch upload in TIND
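
Step 9 can be planned ahead of time. The sketch below only computes the batches and their start times; it does not talk to TIND, and the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def plan_uploads(records, first_upload, batch_size=1000,
                 gap=timedelta(minutes=30)):
    """Return (start_time, batch) pairs, each start one gap after the last."""
    batches = [records[i:i + batch_size]
               for i in range(0, len(records), batch_size)]
    return [(first_upload + n * gap, batch)
            for n, batch in enumerate(batches)]
```

With 2,500 records and a first upload at 09:00, this yields three batches (1,000, 1,000, and 500 records) scheduled for 09:00, 09:30, and 10:00.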

Resources without vendor records for individual titles

  • Gale databases
    • State papers online : the government of Britain, 1509-1714.
  • ProQuest databases
    • Cecil papers