PICA::Data - PICA record processing
use PICA::Data ':all';
$parser = pica_parser( xml => 'picadata.xml' );
$writer = pica_writer( plain => \*STDOUT );
use PICA::Parser::XML;
use PICA::Writer::Plain;
$parser = PICA::Parser::XML->new( @options );
$writer = PICA::Writer::Plain->new( @options );
use PICA::Schema;
$schema = PICA::Schema->new();
# parse records
while ( my $record = $parser->next ) {

    # function accessors
    my $ppn      = pica_value($record, '003@0');
    my $ddc      = pica_match($record, '045Ue', split => 1, nested_array => 1);
    my $holdings = pica_holdings($record);
    my $items    = pica_items($record);
    ...

    # object accessors
    my $ppn      = $record->id;
    my $ppn      = $record->value('003@0');
    my $ppn      = $record->subfields('003@')->{0};
    my $ddc      = $record->match('045Ue', split => 1, nested_array => 1);
    my $holdings = $record->holdings;
    my $items    = $record->items;
    ...

    # write record
    $writer->write($record);

    # write methods
    $record->write($writer);
    $record->write( xml => @options );
    $record->write; # default "plain" writer

    # stringify record
    my $plain = $record->string;
    my $xml   = $record->string('xml');

    # validate record
    my $errors = $schema->check($record);
}
# parse single record from string
my $record = pica_parser('plain', \"...")->next;
# guess parser from input string
my $parser = pica_guess($string)->new(\$string);
PICA::Data provides methods, classes, functions, and a command line application to process PICA+ records.
PICA+ is the internal data format of the Local Library System (LBS) and the Central Library System (CBS) of OCLC, formerly PICA. Similar library formats are the MAchine Readable Cataloging format (MARC) and the Maschinelles Austauschformat fuer Bibliotheken (MAB). In addition to PICA+, CBS uses the cataloging format Pica3, which can be losslessly converted to PICA+ and vice versa.
Records in PICA::Data are encoded either as an array of arrays, the inner arrays representing PICA fields, or as an object with two keys, _id and record, the latter holding the record as an array of arrays and the former holding the record identifier, stored in field 003@, subfield 0. For instance, a minimal record with just one field (having tag 003@ and no occurrence):
{
_id => '12345X',
record => [
[ '003@', undef, '0' => '12345X' ]
]
}
or in short form:
[ [ '003@', undef, '0' => '12345X' ] ]
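The structure above can be walked with plain Perl, without loading the module. The following is a minimal sketch under that assumption: the helper first_value and the sample field 021A are hypothetical and only illustrate the layout (tag, occurrence, then code/value pairs); in practice the accessors documented here would be used instead.

```perl
use strict;
use warnings;

# A record in object form; the second field is made up for illustration.
my $record = {
    _id    => '12345X',
    record => [
        [ '003@', undef, '0' => '12345X' ],
        [ '021A', undef, 'a' => 'Example title', 'h' => 'Example author' ],
    ],
};

# Hypothetical helper: return the first value of subfield $code in the
# first field with tag $tag. Accepts the object form or the short form.
sub first_value {
    my ($record, $tag, $code) = @_;
    my $fields = ref $record eq 'HASH' ? $record->{record} : $record;
    for my $field (@$fields) {
        my ($field_tag, $occurrence, @subfields) = @$field;
        next unless $field_tag eq $tag;
        while (@subfields >= 2) {
            my ($sf_code, $value) = splice @subfields, 0, 2;
            return $value if $sf_code eq $code;
        }
    }
    return;
}

print first_value($record, '003@', '0'), "\n";            # object form
print first_value($record->{record}, '021A', 'a'), "\n";  # short form
```

Both calls read the same list of fields; the object form merely adds the _id key on top of it.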
PICA path expressions (see PICA::Path) can be used to facilitate processing PICA+ records and PICA::Schema can be used to validate PICA+ records with Avram Schemas.
The following functions can be exported on request (use export tag :all to get all of them):
Return a new PICA::Data object from any guessable serialization form (or die).
Return a new PICA+ field as blessed PICA::Data::Field array reference (or die).
Create a PICA parser object (see PICA::Parser::Base). Case of the type is ignored and additional parameters are passed to the parser's constructor:
- PICA::Parser::Binary for type binary (binary PICA+)
- PICA::Parser::Plain for type plain or picaplain (human-readable PICA+)
- PICA::Parser::Plus for type plus or picaplus (normalized PICA+)
- PICA::Parser::Import for type import (PICA Import format)
- PICA::Parser::JSON for type json (PICA JSON)
- PICA::Parser::XML for type xml or picaxml (PICA-XML)
- PICA::Parser::PPXML for type ppxml (PicaPlus-XML)
- PICA::Parser::PIXML for type pixml (PICA FOLIO Import XML)
- PICA::Parser::Patch for type patch (PICA Patch format)
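As an illustration of selecting a parser by type name, here is a short sketch that parses one record from a string in human-readable PICA+. It assumes PICA::Data is installed; the sample data and the field 021A are invented for the example, and the scalar reference input follows the usage shown in the synopsis.

```perl
use strict;
use warnings;
use PICA::Data ':all';

# One record in "plain" syntax; single quotes keep $ and @ literal.
my $plain = join "\n",
    '003@ $012345X',
    '021A $aExample title',
    '';

# Case of the type is ignored, so 'Plain' selects PICA::Parser::Plain.
my $parser = pica_parser( Plain => \$plain );

while ( my $record = $parser->next ) {
    print pica_value( $record, '003@0' ),  "\n";   # record identifier
    print pica_value( $record, '021A$a' ), "\n";   # sample subfield
}
```

Swapping the type name (for instance to plus or json) is all that is needed to read another serialization, provided the input matches that format.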
Guess PICA serialization format from input data. Returns the name of the corresponding parser class or undef.
Convert PICA-XML, expressed in XML::Struct structure into a PICA::Data object.
Create a PICA writer object (see PICA::Writer::Base) in the same way as pica_parser with one of:
- PICA::Writer::Binary for type binary (binary PICA)
- PICA::Writer::Generic for type generic (PICA with self-defined data separators)
- PICA::Writer::Plain for type plain or picaplain (human-readable PICA+)
- PICA::Writer::Import for type import (PICA Import format)
- PICA::Writer::Plus for type plus or picaplus (normalized PICA+)
- PICA::Writer::JSON for type json (PICA JSON)
- PICA::Writer::XML for type xml or picaxml (PICA-XML)
- PICA::Writer::PPXML for type ppxml (PicaPlus-XML)
- PICA::Writer::PIXML for type pixml (PICA FOLIO Import XML)
- PICA::Writer::Patch for type patch (PICA Patch format)
Stringify a record with a given writer (plain by default) and options.
Equivalent to PICA::Path->new($path).
Equivalent to PICA::Path->match_record($path, %options).
Extract the subfield values from a PICA record based on a PICA path expression and options (see PICA::Path). Also available as accessor match($path, %options).
Extract the first subfield value from a PICA record based on a PICA path expression. Also available as accessor value($path).
Extract a list of subfield values from a PICA record based on a PICA path expression. The following are virtually equivalent:
pica_values($record, $path);
$path->record_subfields($record);
$record->values($path);
Returns a PICA record (or empty array reference) limited to fields optionally specified by PICA path expressions. The following are virtually equivalent:
pica_fields($record, $path);
$path->record_fields($record);
$record->fields($path);
Returns a Hash::MultiValue of all subfields of fields optionally specified by PICA path expressions. Also available as accessor subfields.
Returns the record limited to level 0 fields ("title record") in sorted order.
Returns a list (as array reference) of local holding records, sorted by ILN. Level 2 fields are included in sorted order. The ILN (if given) is available as _id. Also available as accessor holdings.
Returns a list (as array reference) of item records. The EPN (if given) is available as _id. Also available as accessor items.
Returns the record split into individual records for each level. Optionally limits the result to a given level, including identifiers (PPN/ILN) of higher levels.
Returns a copy of the record with sorted fields (first level 0 fields, then level 2 fields not belonging to a level 1, then level 1, each followed by level 2 sorted by EPN). Also available as accessor sort.
Sorts and filters subfields of a PICA field (given as array reference) with a subfield schedule.
The schedule can also be given as a string of subfield codes, parsed with parse_subfield_schedule: repeatable subfields must be marked with * or +, otherwise only the first subfield of this code is preserved. Undefined and missing subfields are ignored, as are subfields without information about their order. Returns the modified field, unless it is empty.
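The rules for a schedule given as a string can be sketched in plain Perl. This is not the library's implementation: the functions parse_schedule and apply_schedule, the internal hash layout, and the assumption that the order of codes in the string defines the sort order are all hypothetical and serve only to illustrate the behaviour described above.

```perl
use strict;
use warnings;

# Hypothetical parser: code order in the string defines the sort order,
# a trailing * or + marks the code as repeatable.
sub parse_schedule {
    my ($string) = @_;
    my %schedule;
    my $position = 0;
    while ( $string =~ /([0-9A-Za-z])([*+]?)/g ) {
        $schedule{$1} = { order => $position++, repeatable => $2 ne '' };
    }
    return \%schedule;
}

# Hypothetical filter: drop undefined values and codes without order
# information, keep only the first value of non-repeatable codes, then
# sort by schedule order (input order breaks ties).
sub apply_schedule {
    my ($field, $schedule) = @_;
    my ($tag, $occurrence, @subfields) = @$field;
    my (@kept, %seen);
    my $index = 0;
    while ( @subfields >= 2 ) {
        my ($code, $value) = splice @subfields, 0, 2;
        next unless defined $value && exists $schedule->{$code};
        next if $seen{$code}++ && !$schedule->{$code}{repeatable};
        push @kept, [ $code, $value, $index++ ];
    }
    return unless @kept;    # nothing left: no field is returned
    @kept = sort {
        $schedule->{ $a->[0] }{order} <=> $schedule->{ $b->[0] }{order}
            or $a->[2] <=> $b->[2]
    } @kept;
    return [ $tag, $occurrence, map { @$_[ 0, 1 ] } @kept ];
}

my $field  = [ '021A', undef, b => '2', x => '9', a => '1', a => 'dup', b => '3' ];
my $sorted = apply_schedule( $field, parse_schedule('ab*') );
print join( ' ', map { defined $_ ? $_ : '-' } @$sorted ), "\n";
```

With the schedule ab*, the second a and the unscheduled x are dropped, while both b values survive because b is marked repeatable.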
Get or set a PICA field annotation. Use undef to remove the annotation.
Return the difference between two records as an annotated record. Also available as method diff. See PICA::Patch for details.
Return a new record by application of a difference given as annotated PICA. Also available as method patch. See PICA::Patch for details.
Append a new field to the end of the record.
Change an existing field. This method can be used like method append or with two arguments (path and value) to replace, add or remove a subfield value.
Remove all fields matching given PICA Path expressions. Subfields and positions in the path are ignored.
Reduce and split record to given level except for identifiers PPN/ILN. Returns a list of records.
All accessors of PICA::Data are also available as "FUNCTIONS", prefixed with pica_ (see "SYNOPSIS").
Extract the subfield values from a PICA record based on a PICA::Path expression and options (see method match of PICA::Path).
Extract a list of subfield values from a PICA record based on a PICA::Path expression.
Same as values but only returns the first value.
Returns a PICA record limited to fields specified in a PICA::Path expression. Always returns an array reference.
Returns a Hash::MultiValue of all subfields of fields optionally specified by PICA path expressions.
Returns a list (as array reference) of local holding records (level 1 and 2), where the id of each record contains the ILN (subfield 101@a).
Returns a list (as array reference) of item records (level 2), where the id of each record contains the EPN (subfield 203@/**0).
Returns the record id, if given.
Tell whether the record is empty (no fields).
Reduce and split the record into title record (level=0), holding records (level=1) or copy/item records (level=2). PPN and ILN are included for level 1 and 2 respectively.
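To make the three levels concrete, here is a plain-Perl sketch that groups the fields of a record by level. It relies on an assumption not stated above, namely that the first digit of a PICA+ tag (0, 1 or 2) gives the level of the field; the function fields_by_level and the sample data are hypothetical, and the sketch does not reproduce the identifier handling of the method described above.

```perl
use strict;
use warnings;

# Sample record in short form with one field per level (invented values).
my $record = [
    [ '003@', undef, '0' => '12345X' ],    # level 0: title data, PPN
    [ '101@', undef, 'a' => '20' ],        # level 1: holding data, ILN
    [ '203@', '01',  '0' => '98765' ],     # level 2: item data, EPN
];

# Hypothetical helper: group fields by the first digit of their tag.
sub fields_by_level {
    my ($fields) = @_;
    my %by_level = ( 0 => [], 1 => [], 2 => [] );
    for my $field (@$fields) {
        my ($level) = $field->[0] =~ /^([012])/ or next;
        push @{ $by_level{$level} }, $field;
    }
    return \%by_level;
}

my $levels = fields_by_level($record);
for my $level ( 0 .. 2 ) {
    my @tags = map { $_->[0] } @{ $levels->{$level} };
    print "level $level: @tags\n";
}
```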
Write PICA record with given PICA::Writer::... or PICA::Writer::Plain by default. These are equivalent:
pica_writer( xml => $file )->write( $record );
$record->write( xml => $file );
Serialize PICA record in a given format (plain by default). This method can also be used as function pica_string.
Add a field to the end of the record. An occurrence can be specified as part of the tag or as second argument. Subfields with empty value are ignored, so the following are equivalent:
$record->append('037A/01', a => 'hello', b => 'world', x => undef, y => '');
$record->append('037A', 1, a => 'hello', b => 'world');
To simplify migration from PICA::Record the field may also be given as instance of PICA::Field but this feature may be removed in a future version.
Remove all fields matching given PICA Path expressions. Subfields and positions are ignored so far.
Can be used like method append but replaces an existing field. Alternatively changes selected subfields if called with two arguments:
$record->update('012X$a', 1); # set or add subfield $a to 1, keep other subfields
Setting a subfield value to the empty string or undef removes the subfield.
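The editing methods can be combined as in the following sketch. It assumes PICA::Data is installed and that records returned by the parser are PICA::Data objects, as in the synopsis; the field 037A/01 and its values are taken from the append example and are otherwise arbitrary. The exact serialization printed at the end depends on the writer, so no output is shown.

```perl
use strict;
use warnings;
use PICA::Data ':all';

# Start from a minimal record parsed from human-readable PICA+.
my $plain  = '003@ $012345X' . "\n";
my $record = pica_parser( plain => \$plain )->next;

# Add a field with occurrence 01 and two subfields.
$record->append( '037A/01', a => 'hello', b => 'world' );

# Two-argument form: change one subfield and keep the others.
$record->update( '037A/01$a', 'hi' );

# Setting a subfield to undef removes it.
$record->update( '037A/01$b', undef );

print $record->string;
```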
Calculate the difference of the record to another record.
Calculate a new record by application of an annotated PICA record. Annotations + and - denote fields to be added or removed. Fields with blank annotations are checked to exist in the original record.
The records should not contain multiple records of level 1 and/or level 2.
Johann Rolschewski
Jakob Voß
Carsten Klee
Copyright 2014- Johann Rolschewski and Jakob Voss
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
- picadata command line script to parse, serialize, count, and validate PICA+ data.
- Use Catmandu::PICA for more elaborate processing of PICA records with the Catmandu toolkit.
- PICA::Record implemented an alternative framework for processing PICA+ records (deprecated!).