dc.description.abstracten |
Music Transcription is a task of converting a musical recording into sheet music for
further reproduction. The problem is still unsolved and requires a high level of
expertise. Most of the works split the task into several subproblems. First of them
is called frame-level transcription, which predicts the set of fundamental frequencies
in the original recording for every frame. This subproblem is the main focus of this
work.
The solution to frame-level transcription is called a piano-roll representation - a bi-
nary matrix which represents whether the given note has been played in the frame
or not. However, most of the approaches do not produce a piano-roll representation
in an end-to-end fashion. They rather output a posteriogram - real matrix with the
same dimensions, which represents the level of uncertainty of whether note has been
played during the frame. Ycart and Benetos, 2018 shows that Long Short Term mem-
ory network can be trained to post-process the posteriograms and improve the piano-roll
representation instead of simply cropping the posteriogram at some value.
In this work, we train more robust LSTM network and experiment with different
types of posteriograms. |
uk |