CLIENT
Google DeepMind
ROLES
UX Designer
UX Researcher
COLLABORATORS
Ari Carden
Sherry Chen
Harvin Park
Victoria Qian
TIMELINE
4 months
CONTEXT
How to collect data faster?
As part of my bachelor's thesis in HCI, my team and I were challenged to help our client design a faster way to collect accurate alignment data linking sheet music to performance audio. Our client, Chris Donahue, is researching machine learning methods for generative music. His goal is to enable a broader audience, inclusive of non-musicians, to harness generative music AI.
SOLUTION
An open-source, accessible application for rapid sheet music and audio alignment
SECONDARY RESEARCH
Existing libraries for musicians lack aligned music
The International Music Score Library Project (IMSLP) is a digital library of music scores and one of the largest collections of public domain music, containing 775,000+ scores and 86,000+ recordings as of September 2024. This is great for musicians, as the collection is quite comprehensive; however, there is still a large gap in aligned data: mappings between pixels in sheet music and the corresponding timestamps in associated performances. Our client also provided us with a paper written for the ISMIR 2022 conference with more context on previous examinations of sheet music and audio alignment.
“While several existing MIR datasets contain alignments between performances and structured scores (formats like MIDI and MusicXML), no current resources align performances with more commonplace raw-image sheet music...”
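To make "aligned data" concrete, here is a minimal sketch of what a measure-level alignment record could look like (the schema and field names are my own illustration, not MeSA's actual format): each measure's bounding box in the sheet music image maps to a start and end timestamp in the recording.

```python
from dataclasses import dataclass

@dataclass
class MeasureAlignment:
    """One measure-level alignment record (illustrative schema only)."""
    page: int                                 # page index in the scanned score
    bbox: tuple[float, float, float, float]   # (x, y, width, height) in image pixels
    start_time: float                         # measure onset in the recording, seconds
    end_time: float                           # measure offset in the recording, seconds

# A tiny aligned excerpt: two measure boxes on page 0 mapped to audio timestamps.
alignment = [
    MeasureAlignment(page=0, bbox=(120.0, 310.0, 240.0, 90.0), start_time=0.00, end_time=2.45),
    MeasureAlignment(page=0, bbox=(360.0, 310.0, 230.0, 90.0), start_time=2.45, end_time=4.80),
]
```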
Early prototype from a previous team developed to provide the missing alignment service
MeSA, or Measure to Sound Annotator, was built as an early proof-of-concept for a tool that would provide this service. The lack of resources that align performances and sheet music can be explained by a number of factors:
Music elements such as expressive timing and repeats make annotation slow and difficult.
Varying levels of granularity when it comes to alignment:
Alignment at the piece level is too coarse for any useful application beyond piece recognition.
Note-level alignment would be useful but expensive.
At the line level, there can be failures due to repeat signs.
Real-time alignment is difficult due to expressive timing and annotators' unfamiliarity with a piece.
“To overcome these obstacles, we developed an interactive system, MeSA, which leverages off-the-shelf measure and beat detection software to aid musicians in quickly producing measure-level alignments...”
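For a sense of what "off-the-shelf beat detection" can look like, here is a brief sketch using librosa, one such open-source library (MeSA's actual detection stack is not specified in this case study): it estimates beat times from a recording and derives rough measure boundaries under an assumed 4/4 meter.

```python
import librosa

# Load the performance audio and run off-the-shelf beat tracking.
y, sr = librosa.load("performance.wav")  # hypothetical file path
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

# Rough measure boundaries: every 4th beat, assuming 4/4 time throughout.
# Real pieces have pickups, meter changes, and expressive timing, which is
# exactly why a human annotator stays in the loop.
measure_starts = beat_times[::4]
print(f"~{float(tempo):.0f} BPM; first measure onsets: {measure_starts[:4]}")
```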
Here you can access a small dataset created using MeSA, MeSA-13.
What our client is working with
This early prototype took 20 hours to align 13 pieces.
How might we design an intuitive tool that helps users align sheet music and a corresponding recording faster, so researchers can generate an accurate dataset to train models for generative music?
COMPETITIVE ANALYSIS
Competitors had NO alignment feature, or alignment was not their intended goal
Despite our client's claim that there were no competitors, our team conducted an analysis to verify it. Across the 6 competitors we found, none provided an automatic alignment service; where alignment did occur, it was never the intended goal of the application.
USER INTERVIEWS
Musicians used both sheet music and audio recordings for practice, but separately
Since our research showed a huge gap in aligned data and no service that aligns sheet music and audio, my team interviewed musicians our client was working with at the CMU School of Music.
RESEARCH QUESTIONS:
Could you walk us through your practice routine?
How can you tell that you are improving?
Have you tried recording yourself and playing it back?
Are there any resources or tools you used?
THE MAIN INSIGHT
None of the tools used by our interviewees included recorded performances aligned with sheet music
Based on the trends in the affinity map, we noticed that in addition to sheet music, recordings of performances (of themselves or of others) are essential reference material for learning repertoire, but are found through other sources.
Major Insights
Theme 1: Repetition
Musicians hear a performance multiple times to internalize the piece. Repetition builds familiarity.
They practice note by note, measure by measure, line by line, section by section, page by page, until they get through the whole piece.
Repetition also helps develop and refine other fundamental skills such as sight reading and improvisation.
Theme 2: Practice Rituals
Musicians often look up performances on Spotify / YouTube and follow the sheet music while listening.
Pieces are played frequently (daily / weekly), in sessions ranging from 45 minutes to several hours.
The goal is to play through the entire piece without sheet music or a metronome, i.e., to be able to play under performance conditions.
Theme 3: Recording
Recordings are great for listening to sound quality, observing execution of techniques, and eliciting feedback from others.
Best practice is to record yourself; it tracks improvement over time.
But recording requires too much effort, especially when learning a piece for the first time.
TESTING AND IMPROVEMENTS
3 main design improvements
Over the span of 4 weeks, we iterated on our designs with feedback from 9 peers and our advisor. In that time, we made 3 major improvements:
Shortening Add Measure Process
Peer feedback found that the add-measure button sat too far from the measures; making it a hover button tied to the corresponding measure was easier to use.
In earlier user testing, users had thought the button would add a measure into the sheet music itself.
Bounding box measures only
Originally, we generated corresponding bounding boxes on both the measure and the soundwave.
Peer suggestions pointed out the redundancy of having to generate boxes twice.
Bounding only the measures also allowed our application to run faster, decreasing alignment times.
Displaying only sheet music
At first, we included the soundwave of the recorded performance alongside the measures so users could see the alignment more easily.
User tests with musicians found that the soundwave visualization encouraged overly granular alignment.
Peer feedback suggested removing the soundwave to return to measure-level granularity.
Labeling Measures
We built a low-fidelity prototype to test adjusting measure boxes.
Users all found that dragging to label measures was intuitive.
We tested with three types of musicians:
Music educators
Beginner musicians
Advanced musicians
An advanced musician pointed out that each measure is based on a number of beats rather than timestamps. We brought this feedback to our client, who confirmed that beat timestamps were not necessary to align sheet music and audio.
From a research point of view, it's much more useful to align noteheads rather than beats
Taking a step back to understand the user flow.
Finalizing bounding boxes to label measures
Enabling annotators to add additional boxes as needed
Color coding to indicate algorithm confidence
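As an illustration of the confidence color coding, the mapping can be as simple as thresholding the measure detector's score (the cutoff values here are hypothetical, not tuned ones from the project):

```python
def confidence_color(score: float) -> str:
    """Map a measure detector's confidence score in [0, 1] to a highlight color.

    Thresholds are illustrative; real cutoffs would be tuned against the
    detector's score distribution.
    """
    if score >= 0.9:
        return "green"   # high confidence: box likely needs no adjustment
    if score >= 0.6:
        return "yellow"  # medium confidence: worth a quick visual check
    return "red"         # low confidence: annotator should verify or redraw
```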
Labeling repeats
How do repeats work?
Originally, we designed repeat labeling as tags on the measure boxes.
However, after more user testing, we learned that users did not have a mental model for how to connect the repeat labels.
Zooming out
We designed a jump annotation workflow with connection points, based on clicking the starting and ending measures of a jump.
Users can then preview the logical order of measures based on the jump labels, as sketched below.
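Here is a minimal sketch of that preview logic, under the simplifying assumption that each jump is a (source measure, destination measure) pair taken exactly once; real scores also have voltas and D.C./D.S. marks, which I leave out here.

```python
def expand_jumps(num_measures: int, jumps: list[tuple[int, int]]) -> list[int]:
    """Expand 1-indexed measures into logical playback order.

    A jump (src, dst) means: after playing measure `src` for the first
    time, jump to measure `dst`. Each jump is taken at most once, which
    covers simple repeats but not voltas or D.C./D.S. (illustrative only).
    """
    pending = dict(jumps)  # src -> dst, consumed when taken
    order: list[int] = []
    m = 1
    while m <= num_measures:
        order.append(m)
        if m in pending:
            m = pending.pop(m)  # take the jump once, then fall through next time
        else:
            m += 1
    return order

# An 8-measure piece where measures 1-4 repeat:
print(expand_jumps(8, [(4, 1)]))  # [1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8]
```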
Because adjusting measure boxes and labeling jumps were two core steps in alignment, we designed a breadcrumb showing all the steps.
THE FINAL SCREENS
Final Product
Seconds to align a piece
Originally, it took 20 hours to align 13 pieces (roughly 90 minutes per piece). With our new alignment interface, it took 30 minutes to align 13 pieces (just over 2 minutes per piece), and with improved measure detection, it now takes only seconds to align a piece.
CONCLUSION AND LESSONS LEARNED
What I would do differently next time
As someone who used to study music and has some exposure to AI / ML, I found this project an interesting intersection of the two domains. Not only am I immensely proud of my team's output, but also of the opportunity to work with a client who operates specifically in this space of music and generative AI. Some of my takeaways:
No such thing as too many iterations. In the beginning stages, our team explored many different options to find the right solution for musicians. We drew up several concepts and from there branched out into 4 iterations to ensure that every aspect of the application was crafted with intention.
Tradeoffs are always present. An example I can think of was early in the project, when our iterations still showed audio waves. Showing the changes and clearly communicating what they meant for the user helped not only my team members but also our advisors and client understand the rationale behind certain design decisions. I hope to continue this practice and develop this skill in future projects.
Be insight driven. This case study lived in a Google Doc for quite a while before eventually being published here. That document was about 14 pages long, filled with unnecessary text that didn't answer the question: "how does this fit into the bigger picture?" The following iterations of the case study involved a significant culling (about 60%) of content, focusing mainly on the major points of the project. Storytelling is an ability I am still grappling with, and by homing in on the insights and influential points, I can create more cohesive narratives.
The exchanges that occurred were incredibly valuable: our client was able to create an application that benefits not only their research but also the broader music generative AI community, and our team was able to employ the entire UX process while learning best practices for designing with AI / ML in mind. In the end, I believe I pushed the application to its best state and made sure not to let my own thinking stop me from questioning whether a decision was truly best for the user.
For work inquiries or to chat with me, email me at anthonywudesign@gmail.com
Thanks for reading~
