Interview with Julia Kim

Julia Kim at NYU (photo credit: Elena Olivo)

 

Hey Julia! Introduce yourself please.

Hi, I’m Julia Kim. I’m the Digital Assets Specialist at the American Folklife Center at the Library of Congress. I’ve been here for just about 2 years now. I get to work with a good mix of both digitized and born-digital multi-format collections. While we create collections documenting concerts and events, the heart of our collections are ethnographic materials from collectors, folklorists, and anthropologists. AFC highlights include the Lomax family archival collections, the indigenous Native American wax cylinders, StoryCorps, and “Web Cultures,” but in-between, we get a little bit of everything that is evidence of “folk” around the world. I’m considered a specialist, but I do a bit of everything in my day-to-day work.

What does your media ingest process look like? Does your media ingest process include any tests (manual or automated) on the incoming content? If so, what are the goals of those tests?

Great question. We have different workflows for in-house created, externally produced, and vendor-digitized collections. The collections themselves are also processed to very different levels depending on factors like quality, difficulty, extent (size and types), and staff. For now, though, only a few very special collections get MediaConch treatment and love, but this is all great preparatory work for a division-wide time-based media digitization push that AFC is in the very early stages of planning.

Any vendor-digitized still image collection goes through technical assessors to check file headers against specifications; similarly, we have bulk processes in place to QC and sample still images. These checks have been integrated into our repository system and become available once content has been copied, checksum-verified, and malware-scanned on the ingest servers. Audiovisual and audio content (the bulk of our collections), however, generally runs through checks and processes that are external to our repository environment. This means a mix of tools and software like exiftool, mediainfo, bwf metaedit, sleuthkit, ftk, exact audio copy, exactly, and … it can go on. The time-based media in our collections are a great challenge. Sometimes these tools come into play after the SIP is ingested to prepare the AIP, sometimes they are used before. Regardless, the tools help identify, confirm, and even migrate content to give it a better chance at long-term preservation. Digital preservation as simply copying files to geographically dispersed and backed-up linear tape is no longer sufficient; our jobs are a lot harder now. While we have a few command-line scripts cobbled together and repository tools that work in bulk, we also rely on a lot of manual processes. So… it’s a bit of a smorgasbord that is collection-dependent.
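As a rough sketch of what a small, script-driven bulk check of this kind can look like – illustrative only, not AFC’s actual workflow – the following Python snippet writes an MD5 manifest and a MediaInfo XML sidecar for every file in a staging directory; the paths and report layout are invented:

#!/usr/bin/env python3
"""Illustrative pre-ingest check: checksum manifest plus MediaInfo sidecars.
Paths and the report layout are assumptions, not an actual AFC workflow."""

import hashlib
import subprocess
from pathlib import Path

INGEST_DIR = Path("/staging/incoming_accession")   # hypothetical staging area
REPORT_DIR = Path("/staging/reports")              # hypothetical report location
REPORT_DIR.mkdir(parents=True, exist_ok=True)

def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large video files never have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

with (REPORT_DIR / "manifest.md5").open("w") as manifest:
    for item in sorted(INGEST_DIR.rglob("*")):
        if not item.is_file():
            continue
        # Checksum line in the usual "hash  relative/path" manifest format.
        manifest.write(f"{md5sum(item)}  {item.relative_to(INGEST_DIR)}\n")
        # Technical metadata sidecar; --Output=XML is a standard MediaInfo option.
        xml = subprocess.run(["mediainfo", "--Output=XML", str(item)],
                             capture_output=True, text=True, check=True).stdout
        (REPORT_DIR / (item.name + ".mediainfo.xml")).write_text(xml)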

Where do you use MediaConch? Do you use MediaConch primarily for file validation, for local policy checking, for in-house quality control, for quality testing for vendor files?

So far, I’ve primarily used MediaConch to create reports for new and incoming born-digital video from the Civil Rights History Project (Apple ProRes 4:2:2, 10 TB unique) and for the DPX files (1536, 10-bit, printing density, 20 TB unique) from digitizing celluloid film. Both of these collections share a few factors in common: they’re really important to the department, they’re extremely large in size, and, as is, they present some technical difficulties.

Years ago, some of the first accessions of the CRHP collections were corrupted. In a post-ingest analysis, the technical metadata fields we created revealed signs of the problem, such as truncated audio streams. While all the content was recovered, I decided to adapt our workflows for the new accession and advocated for creating some extremely granular reports as part of the archival package. The challenge now is to sit down and review the reports effectively.
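One plausible way to surface truncation like this after the fact – purely an illustrative sketch, not the analysis actually used at AFC – is to compare each file’s audio stream duration against its container duration with ffprobe; the file extension, threshold and paths below are assumptions:

#!/usr/bin/env python3
"""Flag files whose audio stream is noticeably shorter than the container,
one plausible symptom of a truncated audio stream. Illustrative only."""

import json
import subprocess
import sys
from pathlib import Path

TOLERANCE_SECONDS = 1.0  # arbitrary threshold; tune for your material

def probe(path: Path) -> dict:
    cmd = ["ffprobe", "-v", "error", "-of", "json",
           "-show_entries", "format=duration:stream=codec_type,duration",
           str(path)]
    return json.loads(subprocess.run(cmd, capture_output=True, text=True,
                                     check=True).stdout)

# Assumes the born-digital video arrives as .mov files; adjust as needed.
for path in sorted(Path(sys.argv[1]).rglob("*.mov")):
    info = probe(path)
    container = float(info["format"]["duration"])
    audio = [float(s["duration"]) for s in info["streams"]
             if s.get("codec_type") == "audio" and "duration" in s]
    if audio and container - min(audio) > TOLERANCE_SECONDS:
        print(f"POSSIBLY TRUNCATED AUDIO: {path} "
              f"(container {container:.2f}s, shortest audio {min(audio):.2f}s)")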

With DPX, we didn’t get checksums for some of the batches. With that baseline measure gone, I knew I needed to find something else to ensure that each of the easily 7,000–50,000 files per directory was at least mirroring its siblings’ specifications. I reached out to the hive mind and MediaConch was highly recommended (thanks, Katharine!).

Initially, after creating the XML reports for each of the collections I was working with, our staff used the GUI, but MediaConch would conk out when we tried to point it at DPX directories; even the modest sub-10,000-file directories were simply too many. After several rounds of freezing my Mac and then uninstalling and reinstalling MediaConch, I realized I should just integrate a script. It was much easier than I thought it would be to set up and use right away. Also, it’s great to use MediaConch in all three ways the developers have made it available. I like the browser-based version for storing local policies and comparing them against the public policies other users have generously shared and made available. It’s really useful for thinking about real-world time-based video specifications, too. I was silly enough to crash my computers without having downloaded my policies for future re-use (fail!), so the online version is a great and easily accessible policy back-up. The GUI is also incredibly easy and simple. I have trained staff to use it in minutes, which is not normal when implementing new software. Obviously, though, with the unprecedented number of files per film created when digitizing celluloid to DPX, I had to use the command line. At this point, I’m going to start reviewing the reports, and, again, that’s when I need to really think about making good use of all the data created. While some of this is a “store it and forget it” thing, I want it to be used much more actively as well. I’d be really curious to know how other people use and (re)package reports…
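For what it’s worth, that kind of script integration can be as small as a loop that feeds each DPX frame to the MediaConch CLI against a saved policy. The sketch below is illustrative only: the paths are invented, and the flag spelling (--policy=) and the “fail” string matching should be confirmed against mediaconch --help and the actual report output:

#!/usr/bin/env python3
"""Minimal batch run of MediaConch over a large DPX directory. Illustrative
sketch only; paths are invented and flag/output details should be verified."""

import subprocess
from pathlib import Path

POLICY = Path("dpx_policy.xml")            # e.g. a policy exported from MediaConch Online
DPX_DIR = Path("/ingest/film_scan_0042")   # hypothetical directory of DPX frames
REPORT = Path("film_scan_0042_mediaconch.txt")

with REPORT.open("w") as out:
    for frame in sorted(DPX_DIR.glob("*.dpx")):
        result = subprocess.run(
            ["mediaconch", f"--policy={POLICY}", str(frame)],
            capture_output=True, text=True)
        out.write(result.stdout)
        # A "fail" in the report suggests this frame drifted from the policy
        # its siblings conform to; surface it immediately for review.
        if "fail" in result.stdout.lower():
            print(f"Policy failure: {frame.name}")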

At what point in the archival process do you use MediaConch?

At the end, at least right now. That should change soon, but as a new staff member at AFC, I’m still catching up on various backlogs… although as I say that, I think there will always be some sort of backlog to catch up on! The collections are actually already copied and checksummed on long-term servers long before I’ve used MediaConch with them… at least so far. My number one priority with any newly acquired hard drives is to get them backed up to tape and into our repository systems. We’ve also had a lot of division staff turnover, with the first two digital technicians and one (hybrid) archivist leaving (all promotions), and the current digital technician I’m working with also leaving very shortly. So, excuses aside, I’m probably using MediaConch against my preconception of how I would have implemented it in workflows. But this is all in keeping with my focus this past year on reevaluating older, already-ingested digital collections. AFC has been collecting and engaging in digital preservation for a long time, but MediaConch and tools like it didn’t exist back then.

Do you use MediaConch for MKV/FFV1/LPCM video files, for other video files, for non-video files, or something else?

I use it for video and non-video (DPX), but once I’m through with the two collections mentioned earlier, I expect to expand its application. September is also my annual policy review month here, so I’m hoping to update specifications for future vendor work and donors then. I have started to create piles of migrated LPCM, so… I’m hopeful that I’ll be playing with this more and more.

Why do you think file validation is important?

File validation, specification checking, and checksum verification are probably the bedrocks of digital preservation, regardless of format. They all answer the questions: what is this on the server? Is it what I think it is? Is it what I wanted? This is incredibly important, but it can be difficult to justify because of the time it can take in highly distributed workflows. Problems with collection quality and ingests often only become apparent with access. Given the wild world of audiovisual file specifications, MediaConch’s work with FFV1 and Matroska is really amazing and forward-thinking… and I’m excited for when I get to work with them in the future.

Of course, file validation itself is still not enough for many file types. Many files are invalid, but knowing that a collection includes invalid files is important for assessing preservation risks and understanding collection content. It can also help with creating clearer archival policies for supported versus less-supported specifications – the gray area where many donor-created digital collections fall.

Anything else you’d like to add?

MediaConch has made my life better. I’m grateful to the stars behind the development of MediaConch! Thank you also to the European Commission for funding a tool that is critical to archival work today.


PREFORMA: smart solutions for digital preservation

© 2006 Jens Östman / National Library of Sweden

 

The PREFORMA project is looking for partners willing to deploy and/or further develop the file-format validation tools developed in the project and we would be happy to include your institution or organisation among our early adopters.

Download the model letter of expression of interest to join the PREFORMA community here and send it back to info@preforma-project.eu.

 

The importance of file checking for digital preservation

Digital preservation means taking precautions to ensure long term access to digital content. Each of the different variants of file formats and codecs held in digital archives should be checked periodically. This might necessitate migrating some content to new formats to mitigate the risk of files becoming obsolete or unusable in the future.

If digital files do not comply with the standard specification, then even files of an identical format using the same codecs can have different properties. This means that subsequent migration or conversion processes may yield unpredictable results, jeopardising preservation workflows.

 

Three steps to making digital data future proof

  1. Validate incoming file formats and codecs against their standard specification. Files that conform with their specification can be parsed, processed or rendered consistently by any software that honours the specification.
  2. If necessary, define custom acceptance criteria for archival content and validate whether incoming files comply with such criteria.
  3. Make these checks part of the processing workflow (one minimal way to wire such a check into an ingest workflow is sketched below).
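As a loose illustration of step 3 – not part of the PREFORMA deliverables themselves – the following sketch routes incoming files to an “accepted” or “quarantine” folder based on the result of an external conformance check. The checker command, folder names and the assumption that a zero exit code means “conforms” are all placeholders:

#!/usr/bin/env python3
"""Illustrative ingest step: separate conforming from non-conforming files.
Checker command, exit-code convention and folder names are assumptions."""

import shutil
import subprocess
from pathlib import Path

INCOMING = Path("incoming")      # hypothetical drop folder
ACCEPTED = Path("accepted")      # conforming files continue towards the archive
QUARANTINE = Path("quarantine")  # non-conforming files await manual review

def conforms(path: Path) -> bool:
    """Run an external checker; here we assume exit code 0 and no 'fail' in the
    textual report mean the file conforms. Verify this against your tool."""
    result = subprocess.run(
        ["mediaconch", "--policy=archive_policy.xml", str(path)],
        capture_output=True, text=True)
    return result.returncode == 0 and "fail" not in result.stdout.lower()

for item in INCOMING.iterdir():
    if item.is_file():
        target = ACCEPTED if conforms(item) else QUARANTINE
        target.mkdir(exist_ok=True)
        shutil.move(str(item), target / item.name)
        print(f"{item.name} -> {target}")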

 

The PREFORMA solution

The PREFORMA tools help memory institutions check file conformance, define custom policies, and build an efficient ingest workflow. Download and try them at www.preforma-project.eu/open-source-portal.html!

Four independent modules

  • Implementation Checker: validates compliance with the specification in all respects.
  • Policy Checker: enforces custom institutional policies beyond the scope of the specification itself.
  • Reporter: produces customisable reports formatted for both human readability and automated parsing.
  • Metadata Fixer: carries out any corrections to file metadata, if necessary, to achieve conformance with the specification.

Three media file types

  • Electronic documents (PDF/A)
  • Still images (uncompressed TIFF)
  • Audiovisual files (FFV1 video and LPCM audio in a Matroska container)

Three adaptable program interfaces

  • Command line
  • GUI
  • Web-based

Three deployment options

  • Standalone executable available for most operating systems
  • Network deployment as a client-server application
  • Integration with third-party systems via APIs

Open Source

  • All software is released under the GPLv3+ and MPLv2+ open licenses
  • All digital assets are released under the Creative Commons license CC-BY v4.0

 

******************************

PREFORMA (www.preforma-project.eu) is a pre-commercial procurement project co-funded by the European Commission to enable memory institutions to take control of the conformity testing of digital files intended for long-term preservation. The intention is to reduce preservation costs, improve curation capacity and enhance competences in public organisations while reducing their reliance on individual vendors.

The PREFORMA consortium, coordinated by the National Archives of Sweden, comprises 15 partners from 9 European countries. These partners include national and local cultural organisations, audiovisual archives, public libraries, research centers, universities and SMEs.

During the project, a community of experts and users contributing to our work has grown to include more than 500 individuals from 50 countries across the globe. Through cooperation with institutions and organisations interested in validating the most common file formats they curate, the PREFORMA tools are being integrated into production environments worldwide.

 

Download the PREFORMA brochure here.


International Surrealism Now – artist Santiago Ribeiro in New York

From July 19, 2017 until December 31, 2017, art by Santiago Ribeiro will be shown at the Nasdaq OMX Group in Times Square, New York City.

Nasdaq MarketSite is located in New York City’s Times Square.


The presentation of the works of art will be random, lasting several minutes in each session.

Until December 31, 2017, Nasdaq will exhibit Santiago’s paintings several times, in rotation.

Santiago Ribeiro is a Portuguese surrealist painter who has dedicated himself to promoting 21st-century Surrealism through exhibitions held in various parts of the world: Berlin, Moscow, Dallas, Los Angeles, Mississippi, Warsaw, Nantes, Paris, Florence, Madrid, Granada, Barcelona, Lisbon, Belgrade, Montenegro, Romania, Japan, Taiwan and Brazil.



“Creative with Digital Heritage” – E-Space MOOC is again accepting enrollments for October 2017

For the second academic year, the E-Space MOOC “Creative with Digital Heritage” is accepting enrollments. Europeana Space was an EC-funded project about the creative reuse of digital cultural heritage, which concluded in March 2017 with an “excellent” evaluation from the EC reviewers. In the framework of the continued activities of the E-Space network beyond the end of the funding period, and after the great success of the first run in 2016-2017, this MOOC is being repeated.

The educational idea behind the E-Space MOOC is to lower barriers to the access and reuse of cultural heritage content on Europeana and similar sources. Whether you are a student or teacher with an interest in cultural heritage, a GLAM professional, a developer or simply a cultural heritage enthusiast without prior technical knowledge, this MOOC is for you: how can you engage with and reuse the wealth of digital cultural heritage available online in many repositories such as Europeana? How can you become an active user of this content, using, remixing and reinventing it for your research, lessons, and development?

The course is free and requires an effort of 2-4 hours per week over a length of 8 weeks. The course is available in the KU Leuven section of the edX platform.


What you’ll learn:

  • How to become creative with digital cultural heritage
  • What repositories, tools and APIs are available online
  • How to access and use them
  • How digital cultural heritage can be effectively and successfully reused
  • How to deal with Intellectual Property Rights in the context of reuse of digital cultural heritage

As the online availability of digital cultural heritage continues to grow, it becomes more and more important that users learn to move from being passive readers to active re-users. The mission of this course is to share the creative ways in which people use and re-use Europeana and digital cultural content, to demonstrate what Europeana can bring to the learning community, and to convey the essential idea that cultural content is not just to contemplate, but to live and engage with.

More details and information, and the link to enroll, are available here: http://www.europeana-space.eu/education/mooc/


Heritage, Tourism and Hospitality conference


The Heritage, Tourism and Hospitality conferences focus on the question: “How can tourism destinations succeed in attracting tourists while simultaneously engaging all stakeholders in contributing to the preservation of natural and cultural heritage?”

A special theme this year will be: “Narratives for a World in Transition”. The organisers welcome contributions advancing the understanding of the role of storytelling and narrative approaches and techniques.

Storytelling is a multi-layered and multi-purpose phenomenon. Geographical destinations and tourism dynamics need people to tell and share stories to co-create heritage values and induce valuable tourist experiences.

Storytelling can serve as a strategy and tool for branding, marketing, stakeholder and visitor engagement, sustainable management and innovation. Knowledge of critical success factors and skills in narrative management are required.


In addition, in this world in transition, characterised by globalisation, continuous growth in tourism and mobility based on migrant citizenship, there is the need for researchers and practitioners alike to explore the possibilities of reframing tourism beyond “the tourist gaze” and study the interaction, dialogues and conflicts that arise between visitors, hosts and cultural institutions in the presentation and re-use of the past for touristic purposes.

HTHIC2017 will take place in Pori, Finland, on 27-29 September.

For more information: www.heritagetourismhospitality.org



Interview with Marion Jaks

Marion at work

 

Hey Marion! Introduce yourself please.

Hey Ashley! I am a video archivist at the Austrian Mediathek (Österreichische Mediathek), the Austrian video and sound archive. My main area of work is the digitization of analogue videos and quality control of video files entering our digital archive. Since more and more digital content has been added to our collection in recent years, dealing with digital files from various sources is becoming a growing part of my job.

What does your media ingest process look like? Does your media ingest process include any tests (manual or automated) on the incoming content? If so, what are the goals of those tests?

The Austrian Mediathek started digitizing its audio collection in the year 2000 and its video collection in 2010. Since our analogue collection is still growing, there is ongoing digitization demand at the Mediathek, and we will continue our in-house digitization efforts. Therefore, the biggest share of files ingested into our digital archive is produced by the in-house digitization department. The digitization systems for audio and video both have quality control steps implemented. One main goal of quality control in our institution is to detect artifacts and to find out which artifacts are due to problems during the digitization process and can be undone through certain actions. Both digitization workflows have steps implemented where file metadata is read out, so that the operator can check whether the right settings were used. In DVA-Profession, which we use for our video digitization workflow, ffprobe is used to provide the metadata. This is in my opinion an essential check because it can prevent human error. In digitization we all work with presets that determine the settings of the capture procedure, but it can still happen that someone presses the wrong button…
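As a rough illustration of this kind of preset check – not the Mediathek’s actual DVA-Profession configuration – the following sketch reads a capture file’s stream metadata with ffprobe and compares it against a set of expected values; the expected values themselves are invented for the example:

#!/usr/bin/env python3
"""Compare a capture file's video stream metadata against an expected preset.
EXPECTED values are illustrative placeholders, not a real Mediathek preset."""

import json
import subprocess
import sys

EXPECTED = {"codec_name": "ffv1", "width": 720, "height": 576, "pix_fmt": "yuv422p"}

def video_stream(path: str):
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-of", "json", "-show_streams", path],
        capture_output=True, text=True, check=True).stdout
    streams = json.loads(out)["streams"]
    return next((s for s in streams if s.get("codec_type") == "video"), None)

stream = video_stream(sys.argv[1])
if stream is None:
    sys.exit("No video stream found.")
mismatches = {k: (v, stream.get(k)) for k, v in EXPECTED.items() if stream.get(k) != v}
if mismatches:
    for field, (want, got) in mismatches.items():
        print(f"{field}: expected {want}, found {got}")
    sys.exit(1)
print("Capture settings match the expected preset.")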

So in my opinion, for quality control of digitization processes the main questions are: is the digital file a proper representation of the analogue original? And is the produced file meeting all requirements of a proper archive master file? And for the latter MediaConch is a great help for archivists.

Where do you use MediaConch? Do you use MediaConch primarily for file validation, for local policy checking, for in-house quality control, for quality testing for vendor files?

In our institution, the main use case for MediaConch is files that were produced outside of the default workflow procedure. For example, when we edit clips from already digitized videos, I use MediaConch to easily find out whether the files meet the policy for our archival master.

At what point in the archival process do you use MediaConch?

I use MediaConch when receiving files that were produced outside of our regular workflows, where other checks are already implemented in the system. At the moment this is the case for files that were processed using editing software. At the Austrian Mediathek we are aiming to make as much of our digital (digitized) collection accessible to the public as possible. Due to rights issues, often only parts of the videos are allowed to be published online – that’s where we need to produce special video clips. Those video clips are exported as an archive master in FFV1 so that we can easily produce further copies of the clips in the future. When we are planning the launch of a new project website, those clips can easily number a few hundred. MediaConch is very helpful for checking whether all the export settings were set correctly when there are a lot of files – it saves a lot of time to check the files before any further work is done with them.

Do you use MediaConch for MKV/FFV1/LPCM video files, for other video files, for non-video files, or something else?

We use MediaConch to check if produced video files meet the criteria of our archive master settings. In 2010 our decision for an archival master was to produce FFV1/PCM in an AVI-container. In the near future we are thinking of changing the container to MKV. Peter Bubestinger published the policy of the Austrian Mediathek’s archival master copies: https://mediaarea.net/MediaConchOnline/publicPolicies

Why do you think file validation is important?

I think the most central point in archiving is being in control of your collection. With regard to a digital collection, this means knowing what kinds of files you have and what state they are in. With a growing digital collection, you will get all kinds of files in different codecs and containers. At the Austrian Mediathek we collect video and audio files from different institutions, professionals as well as private people – so our collection is very diverse and there is no way that we can prescribe any delivery conditions for files entering our institution. Most of the donors of content do not know much about file formats, codecs, and containers. At first I found it very surprising that even film/video professionals often cannot tell you what codec they use for their copies – after some years it is the other way round, and I am impressed when filmmakers can tell me specifics about their files.

With this in mind, the first step when receiving a digital collection is to find out what kinds of files it contains. Files that do not have a long-term perspective (e.g. proprietary codecs) should be normalized using a lossless codec like FFV1. With a very diverse collection, transcoding can be a demanding task when your goal is to make sure that all features of the original are transferred to the archival master copy. Since none of us is immune to human error, there must be checks implemented to make sure newly produced archival masters meet the defined criteria. Otherwise you can never be sure about your digital collection and have therefore lost control over your archive: every archivist’s nightmare.
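A generic normalisation step of this kind – illustrative only, not the Mediathek’s actual transcoding chain – might look like the following: transcode the video to FFV1 while keeping the audio as uncompressed PCM in a Matroska container, then probe the result. File names and the assumed audio bit depth are placeholders:

#!/usr/bin/env python3
"""Normalise a source file to FFV1 video with PCM audio in Matroska, then
confirm the result with ffprobe. Illustrative sketch with invented names."""

import subprocess
import sys

source = sys.argv[1]                        # e.g. donor_tape_capture.mov
master = source.rsplit(".", 1)[0] + ".mkv"  # archival master name, for illustration

subprocess.run([
    "ffmpeg", "-i", source,
    "-c:v", "ffv1", "-level", "3",   # FFV1 version 3: slices and per-slice CRCs
    "-g", "1",                        # intra-only: every frame is a keyframe
    "-c:a", "pcm_s16le",              # uncompressed PCM audio (bit depth assumed)
    master,
], check=True)

# Quick confirmation that the master really carries FFV1 video.
subprocess.run(["ffprobe", "-v", "error", "-select_streams", "v:0",
                "-show_entries", "stream=codec_name,pix_fmt",
                "-of", "default=noprint_wrappers=1", master], check=True)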

Anything else you’d like to add?

I think that MediaConch fully unfolds its abilities in the quality control of files produced by digitization vendors. A few years ago we outsourced the digitization of 16mm film material. I was involved in the quality control of the produced files. At that time MediaConch was not around, and the simple checks of whether the digitized files met the criteria (e.g. codec/container) were very time-consuming. Nowadays MediaConch makes tasks like that so much easier and faster. I also like very much that MediaConch has a great, easy-to-use GUI, so that colleagues not used to the command line can easily do file validation checks. Thank you very much for the great work you are doing at MediaConch!

Thanks Marion! For more from the Mediathek, check out their audio/video web portal!


veraPDF 1.8 released

The latest version of veraPDF is now available to download on the PREFORMA Open Source Portal. This release includes new PDF information in the features report and automatic configuration of the features extractor when applying custom policy profiles. There are also a number of low-level PDF fixes and improvements which are documented in the latest release notes: https://github.com/veraPDF/veraPDF-library/releases/latest

This is the final release of veraPDF under the PREFORMA project. From September, veraPDF will make the transition from a funded project to a standalone open source project.

 

Download veraPDF

http://www.preforma-project.eu/verapdf-download.html

 

Help improve veraPDF

Most of the changes in this release are in response to the constructive feedback we’ve received from our users. Testing and user feedback is key to improving the software. Please download and use the latest release. If you experience problems, or wish to suggest improvements, please add them to the project’s GitHub issue tracker: https://github.com/veraPDF/veraPDF-library/issues  or post them to our user mailing list: http://lists.verapdf.org/listinfo/users.

 

Getting started

User guides and documentation are published at: http://docs.verapdf.org.

 

Save the date!

Join us for our next webinar on 30 August ‘Life after PREFORMA, the future of veraPDF’. Registration opening soon.

 

About

The veraPDF consortium (http://verapdf.org/) is funded by the PREFORMA project (http://www.preforma-project.eu/). PREFORMA (PREservation FORMAts for culture information/e-archives) is a Pre-Commercial Procurement (PCP) project co-funded by the European Commission under its FP7-ICT Programme. The project’s main aim is to address the challenge of implementing standardised file formats for preserving digital objects in the long term, giving memory institutions full control over the acceptance and management of preservation files into digital repositories.


DPF Manager 3.5 available to download


 

A new version of the DPF Manager has been released!

There are several improvements in this update. The most significant one is the redesign of the reports section, where it is now easier (and much faster) to browse the analyzed reports, transform between formats, and convert a quick check into a full check. The global report has also been improved with a more visual and interactive panel.

Other interesting new functionalities and enhancements are the following:

  • The METS report is now generated in all cases (previously it was only generated if there were no errors at all)
  • Reports can be downloaded from the GUI
  • Added pagination to improve efficiency (especially when thousands of files are checked)
  • Execution time is now included in global reports
  • Fixed bugs related to validation of zipped TIFF files
  • Reports can be deleted from the global reports page
  • Sorting reports by the “passed” column now shows the ones with warnings first
  • Fixed a bug when trying to send feedback with non-XML reports
  • Compatibility with old reports (prior to version 3.3)

See the full list of new functionalities on the GitHub tag page, along with the list of solved issues.


CEPROQHA project: Preservation and Restoration of Qatar Cultural Heritage through Advanced Holoscopic 3D Imaging

CEPROQHA (NPRP9-181-1-036) is a cooperative research project between Qatar University and Brunel University (UK), funded by the Qatar National Research Fund (QNRF) under the National Priorities Research Program (NPRP). As the State of Qatar transitions to a knowledge-based economy, it seeks to ensure that the nation’s path to modernity is rooted in its values and traditions, considering digitization technologies and tools for the preservation of Qatar’s culture, traditions and heritage. As part of the Digital Content Program, the national plan aims to provide incentives for the development of a vibrant digital ecosystem through which future generations can tap into their past and create new expressions of Qatari culture on the global stage.


The global aim of this project is to develop a new framework for cost-effective cultural heritage preservation using cutting-edge 3D holoscopic imaging technology and archival tools. For this, CEPROQHA leverages its partners’ research and innovation expertise to achieve its objectives.

The documentation of cultural assets is inherently a multimedia process, addressed through digital representation of the shape, appearance and preservation condition of the Cultural Heritage (CH) object. CH assets are not physically cloneable or impeccably restorable, and hence their curation as well as their long-term preservation require advanced 3D modelling technologies. However, this poses serious challenges, since the generation of high-quality 3D models is still very time-consuming and expensive, not least because the modelling is carried out for individual objects rather than for entire collections. Furthermore, the outcome of digital reconstructions is frequently provided in formats that are not interoperable, and therefore cannot be easily accessed and/or re-used and understood by scholars, curators or those working in cultural heritage industries, thereby posing serious risks to the sustainability of the reconstructions. Indeed, the digital model is progressively becoming the actual representation of a cultural heritage asset for anybody, anywhere and anytime, and therefore this project intends to acknowledge the changing role that reconstruction, visualisation and management now play in the curation of heritage and its analysis.

Project website: http://www.ceproqha.qa/


BIENNIAL EVENT / SINOPALE 2017

The 6th edition of Sinopale–International Sinop Biennial, under the common title “Transposition”, will take place in Sinop, Turkey, from 1 August to 17 September 2017.

Affirming the periphery, Sinopale brings international contemporary art to the city of Sinop by the Black Sea in Northern Turkey. Sinopale is a biennial event – investigating various forms of resistance and adaptation of local movements and initiatives from civil society, ecological activism and nongovernmental politics. The summer of 2017 will align artistic processes of sharing, commonality and difference. The artists will be working in situ. A horizontal collaborative structure will bring the invited artists together with the citizens of Sinop to create spaces for aesthetic, social and political practice.


Sinopale 6 will revolve around the notion of Transposition. Transposition is a word with several meanings, which all signify a shift between values. Playing with the gap in-between, it becomes possible to open room for process and transference, to open a conceptual space as well as a scope for action and imagination. Sinopale 6 furthermore taps into dialogues on the history of material and cultural memory in order to create an associative space full of cross-references.

Sinopale is a young biennial for contemporary art on the periphery, consisting of a multitude of events of different formats. As the long-term organizer of Sinopale, the European Cultural Association emphasizes its sustainable micro-political and emancipatory efforts. The organization works in close co-operation with international curators, who are responsible for the selection of the artists and the program. We take a pragmatic and functional collective approach to the exhibition, events and their thematic and discursive direction in order to generate a format of cross-cultural exchange with the local context.

Download press release (PDF, 206 Kb)

Website: http://sinopale.org/