Generate music with an RNN

This tutorial shows you how to generate musical notes using a simple recurrent neural network (RNN). You will train a model using a collection of piano MIDI files from the MAESTRO dataset. Given a sequence of notes, your model will learn to predict the next note in the sequence. You can generate longer sequences of notes by calling the model repeatedly.

This tutorial contains complete code to parse and create MIDI files. You can learn more about how RNNs work by visiting the Text generation with an RNN tutorial.

Setup

This tutorial uses the pretty_midi library to create and parse MIDI files, and pyfluidsynth for generating audio playback in Colab.

sudo apt install -y fluidsynth

pip install --upgrade pyfluidsynth

pip install pretty_midi

import collections
import datetime
import fluidsynth
import glob
import numpy as np
import pathlib
import pandas as pd
import pretty_midi
import seaborn as sns
import tensorflow as tf

from IPython import display
from matplotlib import pyplot as plt
from typing import Dict, List, Optional, Sequence, Tuple

seed = 42
tf.random.set_seed(seed)
np.random.seed(seed)

# Sampling rate for audio playback
_SAMPLING_RATE = 16000

Download the Maestro dataset

data_dir = pathlib.Path('data/maestro-v2.0.0')
if not data_dir.exists():
  tf.keras.utils.get_file(
      'maestro-v2.0.0-midi.zip',
      origin='https://storage.googleapis.com/magentadata/datasets/maestro/v2.0.0/maestro-v2.0.0-midi.zip',
      extract=True,
      cache_dir='.', cache_subdir='data',
  )

Downloading data from https://storage.googleapis.com/magentadata/datasets/maestro/v2.0.0/maestro-v2.0.0-midi.zip
59244544/59243107 [==============================] - 3s 0us/step
59252736/59243107 [==============================] - 3s 0us/step

The dataset contains about 1,200 MIDI files.

filenames = glob.glob(str(data_dir/'**/*.mid*'))
print('Number of files:', len(filenames))

Number of files: 1282

Process a MIDI file

First, use pretty_midi to parse a single MIDI file and inspect the format of the notes. If you would like to download the MIDI file below to play on your computer, you can do so in Colab by writing files.download(sample_file).

sample_file = filenames[1]
print(sample_file)

data/maestro-v2.0.0/2013/ORIG-MIDI_02_7_6_13_Group__MID--AUDIO_08_R1_2013_wav--3.midi

Create a PrettyMIDI object for the sample MIDI file.

pm = pretty_midi.PrettyMIDI(sample_file)

Play the sample file. The playback widget may take several seconds to load.

def display_audio(pm: pretty_midi.PrettyMIDI, seconds=30):
  waveform = pm.fluidsynth(fs=_SAMPLING_RATE)
  # Take a sample of the generated waveform to mitigate kernel resets
  waveform_short = waveform[:seconds*_SAMPLING_RATE]
  return display.Audio(waveform_short, rate=_SAMPLING_RATE)

display_audio(pm)

Inspect the MIDI file. What kinds of instruments are used?

print('Number of instruments:', len(pm.instruments))
instrument = pm.instruments[0]
instrument_name = pretty_midi.program_to_instrument_name(instrument.program)
print('Instrument name:', instrument_name)

Number of instruments: 1
Instrument name: Acoustic Grand Piano

Extract notes

for i, note in enumerate(instrument.notes[:10]):
  note_name = pretty_midi.note_number_to_name(note.pitch)
  duration = note.end - note.start
  print(f'{i}: pitch={note.pitch}, note_name={note_name},'
        f' duration={duration:.4f}')

0: pitch=56, note_name=G#3, duration=0.0352
1: pitch=44, note_name=G#2, duration=0.0417
2: pitch=68, note_name=G#4, duration=0.0651
3: pitch=80, note_name=G#5, duration=0.1693
4: pitch=78, note_name=F#5, duration=0.1523
5: pitch=76, note_name=E5, duration=0.1120
6: pitch=75, note_name=D#5, duration=0.0612
7: pitch=49, note_name=C#3, duration=0.0378
8: pitch=85, note_name=C#6, duration=0.0352
9: pitch=37, note_name=C#2, duration=0.0417

You will use three variables to represent a note when training the model: pitch, step and duration. The pitch is the perceptual quality of the sound, expressed as a MIDI note number. The step is the time elapsed since the previous note or the start of the track. The duration is how long the note plays, in seconds, and is the difference between the note's end time and its start time.
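To make these definitions concrete, here is a minimal sketch that derives step and duration from a few made-up (pitch, start, end) triples. The toy values and the helper name encode_notes are hypothetical and are not part of the tutorial's code; the sketch also assumes that the first note is given a step of 0.

```python
from typing import List, Tuple

def encode_notes(
    notes: List[Tuple[int, float, float]]
) -> List[Tuple[int, float, float]]:
    """Convert (pitch, start, end) triples into (pitch, step, duration)."""
    if not notes:
        return []
    # Sort by start time so each step is measured from the previous note.
    ordered = sorted(notes, key=lambda n: n[1])
    encoded = []
    prev_start = ordered[0][1]  # assumption: the first note has step 0
    for pitch, start, end in ordered:
        encoded.append((pitch, start - prev_start, end - start))
        prev_start = start
    return encoded

# Hypothetical notes: (MIDI pitch, start in seconds, end in seconds)
toy_notes = [(60, 1.0, 1.5), (64, 1.25, 2.0), (67, 2.0, 2.25)]
encoded = encode_notes(toy_notes)
print(encoded)
```

Note that step is measured between consecutive start times, so overlapping notes (such as the first two above) simply get a step smaller than the previous note's duration.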

Extract the notes from the sample MIDI file.

def midi_to_notes(midi_file: str) -> pd.DataFrame:
  pm = pretty_midi.PrettyMIDI(midi_file)
  instrument = pm.instruments[0]
  notes = collections.defaultdict(list)

  # Sort the notes by start time
  sorted_notes = sorted(instrument.notes, key=lambda note: note.start)
  prev_start = sorted_notes[0].start

  for note in sorted_notes:
    start = note.start
    end = note.end
    notes['pitch'].append(note.pitch)
    notes['start'].append(start)
    notes['end'].append(end)
    notes['step'].append(start - prev_start)
    notes['duration'].append(end - start)
    prev_start = start

  return pd.DataFrame({name: np.array(value) for name, value in notes.items()})

raw_notes = midi_to_notes(sample_file)
raw_notes.head()

It may be easier to interpret note names than pitch numbers, so you can use the function below to convert numeric pitch values to note names. The note name shows the type of note, the accidental and the octave number (for example, C#4).

get_note_names = np.vectorize(pretty_midi.note_number_to_name)
sample_note_names = get_note_names(raw_notes['pitch'])
sample_note_names[:10]

array(['G#3', 'G#5', 'G#4', 'G#2', 'F#5', 'E5', 'D#5', 'C#3', 'C#6',
       'C#5'], dtype='<U3')
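For intuition about the mapping from pitch numbers to names, the following dependency-free sketch does the same kind of conversion by hand. It is not pretty_midi's implementation: the helper name pitch_to_name is hypothetical, and it assumes sharps-only spelling and the convention that MIDI note 60 is written C4, which agrees with the output printed earlier (pitch 56 shown as G#3).

```python
_NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def pitch_to_name(pitch: int) -> str:
    """Map a MIDI note number (0-127) to a name such as 'C#4'."""
    if not 0 <= pitch <= 127:
        raise ValueError(f'MIDI pitch out of range: {pitch}')
    # Twelve semitones per octave; assumption: note 60 is labelled C4.
    octave = pitch // 12 - 1
    return f'{_NOTE_NAMES[pitch % 12]}{octave}'

for pitch in (60, 56, 85):
    print(pitch, pitch_to_name(pitch))
```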

To visualize the musical piece, plot the note pitch, start and end across the length of the track (that is, a piano roll). Start with the first 100 notes.

def plot_piano_roll(notes: pd.DataFrame, count: Optional[int] = None):
  if count:
    title = f'First {count} notes'
  else:
    title = 'Whole track'
    count = len(notes['pitch'])
  plt.figure(figsize=(20, 4))
  plot_pitch = np.stack([notes['pitch'], notes['pitch']], axis=0)
  plot_start_stop = np.stack([notes['start'], notes['end']], axis=0)
  plt.plot(
      plot_start_stop[:, :count], plot_pitch[:, :count], color="b", marker=".")
  plt.xlabel('Time [s]')
  plt.ylabel('Pitch')
  _ = plt.title(title)

plot_piano_roll(raw_notes, count=100)

(Figure: piano roll of the first 100 notes, pitch against time in seconds)

Plot the notes for the entire track.

plot_piano_roll(raw_notes)

(Figure: piano roll of the whole track, pitch against time in seconds)

Check the distribution of each note variable.

def plot_distributions(notes: pd.DataFrame, drop_percentile=2.5):
  plt.figure(figsize=[15, 5])
  plt.subplot(1, 3, 1)
  sns.histplot(notes, x="pitch", bins=20)

  plt.subplot(1, 3, 2)
  max_step = np.percentile(notes['step'], 100 - drop_percentile)
  sns.histplot(notes, x="step", bins=np.linspace(0, max_step, 21))

  plt.subplot(1, 3, 3)
  max_duration = np.percentile(notes['duration'], 100 - drop_percentile)
  sns.histplot(notes, x="duration", bins=np.linspace(0, max_duration, 21))

plot_distributions(raw_notes)

(Figure: histograms of pitch, step and duration for the sample track)

Create a MIDI file

You can generate your own MIDI file from a list of notes using the function below.

def notes_to_midi(
  notes: pd.DataFrame,
  out_file: str, 
  instrument_name: str,
  velocity: int = 100,  # note loudness
) -> pretty_midi.PrettyMIDI:

  pm = pretty_midi.PrettyMIDI()
  instrument = pretty_midi.Instrument(
      program=pretty_midi.instrument_name_to_program(
          instrument_name))

  prev_start = 0
  for i, note in notes.iterrows():
    start = float(prev_start + note['step'])
    end = float(start + note['duration'])
    note = pretty_midi.Note(
        velocity=velocity,
        pitch=int(note['pitch']),
        start=start,
        end=end,
    )
    instrument.notes.append(note)
    prev_start = start

  pm.instruments.append(instrument)
  pm.write(out_file)
  return pm

example_file = 'example.midi'
example_pm = notes_to_midi(
    raw_notes, out_file=example_file, instrument_name=instrument_name)

Play the generated MIDI file and see if there is any difference.

display_audio(example_pm)

As before, you can write files.download(example_file) to download and play this file.
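The loop in notes_to_midi rebuilds absolute start and end times from the relative step and duration values. The sketch below shows the same arithmetic in plain Python on three made-up notes; `relative_to_absolute` is a hypothetical helper, not something the tutorial defines.

```python
from itertools import accumulate

def relative_to_absolute(steps, durations):
  """Returns (start, end) pairs from per-note step and duration values."""
  starts = list(accumulate(steps))  # running sum of the steps
  return [(start, start + duration) for start, duration in zip(starts, durations)]

# Three made-up notes: the second starts 0.5 s after the first, and so on.
steps = [0.0, 0.5, 0.25]
durations = [1.0, 0.5, 0.5]
print(relative_to_absolute(steps, durations))
# [(0.0, 1.0), (0.5, 1.0), (0.75, 1.25)]
```

Because each start is the previous start plus the step, notes can overlap: the first note above is still sounding when the second begins.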

Create the training dataset

Create the training dataset by extracting notes from the MIDI files. You can start with a small number of files and experiment with more later. This may take a couple of minutes.

num_files = 5
all_notes = []
for f in filenames[:num_files]:
  notes = midi_to_notes(f)
  all_notes.append(notes)

all_notes = pd.concat(all_notes)

n_notes = len(all_notes)
print('Number of notes parsed:', n_notes)

Number of notes parsed: 23163

Next, create a tf.data.Dataset from the parsed notes.

key_order = ['pitch', 'step', 'duration']
train_notes = np.stack([all_notes[key] for key in key_order], axis=1)

notes_ds = tf.data.Dataset.from_tensor_slices(train_notes)
notes_ds.element_spec

TensorSpec(shape=(3,), dtype=tf.float64, name=None)

You will train the model on batches of sequences of notes. Each example consists of a sequence of notes as the input features and the next note as the label. In this way, the model is trained to predict the next note in a sequence. You can find a diagram describing this process (and more details) in Text classification with an RNN.

You can use the handy window function with size seq_length to create the features and labels in this format.

def create_sequences(
    dataset: tf.data.Dataset, 
    seq_length: int,
    vocab_size = 128,
) -> tf.data.Dataset:
  """Returns TF Dataset of sequence and label examples."""
  seq_length = seq_length+1

  # Take 1 extra for the labels
  windows = dataset.window(seq_length, shift=1, stride=1,
                              drop_remainder=True)

  # `flat_map` flattens the "dataset of datasets" into a dataset of tensors
  flatten = lambda x: x.batch(seq_length, drop_remainder=True)
  sequences = windows.flat_map(flatten)

  # Normalize note pitch
  def scale_pitch(x):
    x = x/[vocab_size,1.0,1.0]
    return x

  # Split the labels
  def split_labels(sequences):
    inputs = sequences[:-1]
    labels_dense = sequences[-1]
    labels = {key:labels_dense[i] for i,key in enumerate(key_order)}

    return scale_pitch(inputs), labels

  return sequences.map(split_labels, num_parallel_calls=tf.data.AUTOTUNE)
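To make the windowing concrete, here is a minimal plain-Python sketch of the same idea: slide a window of seq_length + 1 items over the data, then split each window into inputs and a label. It uses integers instead of notes and the hypothetical helper `sliding_windows`; it does not use the tf.data API and skips the pitch scaling.

```python
def sliding_windows(items, seq_length):
  """Yields (inputs, label) pairs: seq_length items, then the item after them."""
  window = seq_length + 1  # one extra element for the label
  for i in range(len(items) - window + 1):
    chunk = items[i:i + window]
    yield chunk[:-1], chunk[-1]

pairs = list(sliding_windows(list(range(6)), seq_length=3))
for inputs, label in pairs:
  print(inputs, '->', label)
# [0, 1, 2] -> 3
# [1, 2, 3] -> 4
# [2, 3, 4] -> 5
```

With a shift of 1, n items produce n - seq_length examples, which is why consecutive examples overlap heavily.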

Set the sequence length for each example. Experiment with different lengths (e.g. 50, 100, 150) to see which one works best for the data, or use hyperparameter tuning. The size of the vocabulary (vocab_size) is set to 128, representing all the pitches supported by pretty_midi.

seq_length = 25
vocab_size = 128
seq_ds = create_sequences(notes_ds, seq_length, vocab_size)
seq_ds.element_spec

(TensorSpec(shape=(25, 3), dtype=tf.float64, name=None),
 {'pitch': TensorSpec(shape=(), dtype=tf.float64, name=None),
  'step': TensorSpec(shape=(), dtype=tf.float64, name=None),
  'duration': TensorSpec(shape=(), dtype=tf.float64, name=None)})

Each input sequence has shape (25, 3): the model takes 25 notes, each described by pitch, step, and duration, as input and learns to predict the following note as output.

for seq, target in seq_ds.take(1):
  print('sequence shape:', seq.shape)
  print('sequence elements (first 10):', seq[0: 10])
  print()
  print('target:', target)

sequence shape: (25, 3)
sequence elements (first 10): tf.Tensor(
[[0.578125   0.         0.1484375 ]
 [0.390625   0.00130208 0.0390625 ]
 [0.3828125  0.03255208 0.07421875]
 [0.390625   0.08203125 0.14713542]
 [0.5625     0.14973958 0.07421875]
 [0.546875   0.09375    0.07421875]
 [0.5390625  0.12239583 0.04947917]
 [0.296875   0.01692708 0.31119792]
 [0.5234375  0.09895833 0.04036458]
 [0.5078125  0.12369792 0.06380208]], shape=(10, 3), dtype=float64)

target: {'pitch': <tf.Tensor: shape=(), dtype=float64, numpy=67.0>, 'step': <tf.Tensor: shape=(), dtype=float64, numpy=0.1171875>, 'duration': <tf.Tensor: shape=(), dtype=float64, numpy=0.04947916666666652>}


Batch the examples, and configure the dataset for performance.

batch_size = 64
buffer_size = n_notes - seq_length  # the number of items in the dataset
train_ds = (seq_ds
            .shuffle(buffer_size)
            .batch(batch_size, drop_remainder=True)
            .cache()
            .prefetch(tf.data.AUTOTUNE))

train_ds.element_spec

(TensorSpec(shape=(64, 25, 3), dtype=tf.float64, name=None),
 {'pitch': TensorSpec(shape=(64,), dtype=tf.float64, name=None),
  'step': TensorSpec(shape=(64,), dtype=tf.float64, name=None),
  'duration': TensorSpec(shape=(64,), dtype=tf.float64, name=None)})
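As a quick sanity check on the sizes involved, the arithmetic below works out how many sequences and batches the pipeline should produce, assuming the 23163 parsed notes reported earlier. The result agrees with the 361 steps per epoch shown in the evaluation and training logs.

```python
n_notes = 23163      # "Number of notes parsed" printed earlier
seq_length = 25
batch_size = 64

n_sequences = n_notes - seq_length      # one example per full window
n_batches = n_sequences // batch_size   # drop_remainder=True discards the partial batch
n_dropped = n_sequences % batch_size    # examples left out of the final partial batch

print(n_sequences, n_batches, n_dropped)  # 23138 361 34
```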

Create and train the model

The model has three outputs, one for each note variable. For step and duration, you will use a custom loss function based on mean squared error that encourages the model to output non-negative values.

def mse_with_positive_pressure(y_true: tf.Tensor, y_pred: tf.Tensor):
  mse = (y_true - y_pred) ** 2
  positive_pressure = 10 * tf.maximum(-y_pred, 0.0)
  return tf.reduce_mean(mse + positive_pressure)
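To see what the positive-pressure term does, here is a plain-Python restatement of the loss above applied to two guesses that are equally far from the target. The name `mse_with_positive_pressure_py` is hypothetical and the input values are made up for illustration.

```python
def mse_with_positive_pressure_py(y_true, y_pred):
  """Mean of squared error plus 10 * max(-prediction, 0) per element."""
  terms = [
      (t - p) ** 2 + 10 * max(-p, 0.0)
      for t, p in zip(y_true, y_pred)
  ]
  return sum(terms) / len(terms)

# Same distance from the target (0.2), but only the negative guess is penalized.
print(mse_with_positive_pressure_py([0.1], [0.3]))   # about 0.04
print(mse_with_positive_pressure_py([0.1], [-0.1]))  # about 1.04
```

The penalty grows linearly with how negative the prediction is, so it dominates the squared error for small negative outputs.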

input_shape = (seq_length, 3)
learning_rate = 0.005

inputs = tf.keras.Input(input_shape)
x = tf.keras.layers.LSTM(128)(inputs)

outputs = {
  'pitch': tf.keras.layers.Dense(128, name='pitch')(x),
  'step': tf.keras.layers.Dense(1, name='step')(x),
  'duration': tf.keras.layers.Dense(1, name='duration')(x),
}

model = tf.keras.Model(inputs, outputs)

loss = {
      'pitch': tf.keras.losses.SparseCategoricalCrossentropy(
          from_logits=True),
      'step': mse_with_positive_pressure,
      'duration': mse_with_positive_pressure,
}

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

model.compile(loss=loss, optimizer=optimizer)

model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(None, 25, 3)]      0           []                               
                                                                                                  
 lstm (LSTM)                    (None, 128)          67584       ['input_1[0][0]']                
                                                                                                  
 duration (Dense)               (None, 1)            129         ['lstm[0][0]']                   
                                                                                                  
 pitch (Dense)                  (None, 128)          16512       ['lstm[0][0]']                   
                                                                                                  
 step (Dense)                   (None, 1)            129         ['lstm[0][0]']                   
                                                                                                  
==================================================================================================
Total params: 84,354
Trainable params: 84,354
Non-trainable params: 0
__________________________________________________________________________________________________
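The parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias) and fully connected Dense layers with a bias; the helper names are hypothetical.

```python
def lstm_params(units: int, input_dim: int) -> int:
  # Four gates, each with input weights, recurrent weights and a bias.
  return 4 * (units * (input_dim + units) + units)

def dense_params(units: int, input_dim: int) -> int:
  return input_dim * units + units  # weights plus bias

lstm = lstm_params(units=128, input_dim=3)
pitch = dense_params(units=128, input_dim=128)
step = dense_params(units=1, input_dim=128)
duration = dense_params(units=1, input_dim=128)
print(lstm, pitch, step, duration, lstm + pitch + step + duration)
# 67584 16512 129 129 84354
```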

Testing the model.evaluate function, you can see that the pitch loss is significantly greater than the step and duration losses. Note that loss is the total loss, computed by summing all the other losses, and is currently dominated by the pitch loss.

losses = model.evaluate(train_ds, return_dict=True)
losses

361/361 [==============================] - 6s 4ms/step - loss: 5.0011 - duration_loss: 0.1213 - pitch_loss: 4.8476 - step_loss: 0.0322
{'loss': 5.001128196716309,
 'duration_loss': 0.12134315073490143,
 'pitch_loss': 4.847629547119141,
 'step_loss': 0.03215572610497475}

One way to balance this is to pass the loss_weights argument to compile:

model.compile(
    loss=loss,
    loss_weights={
        'pitch': 0.05,
        'step': 1.0,
        'duration':1.0,
    },
    optimizer=optimizer,
)

The loss then becomes the weighted sum of the individual losses.

model.evaluate(train_ds, return_dict=True)

361/361 [==============================] - 2s 4ms/step - loss: 0.3959 - duration_loss: 0.1213 - pitch_loss: 4.8476 - step_loss: 0.0322
{'loss': 0.39588069915771484,
 'duration_loss': 0.12134315073490143,
 'pitch_loss': 4.847629547119141,
 'step_loss': 0.03215572610497475}
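The two totals can be checked directly from the per-output losses printed above. This is plain arithmetic on the reported numbers, so small differences in the last digits compared with the Keras output are expected.

```python
losses = {
    'pitch': 4.847629547119141,
    'step': 0.03215572610497475,
    'duration': 0.12134315073490143,
}
loss_weights = {'pitch': 0.05, 'step': 1.0, 'duration': 1.0}

unweighted_total = sum(losses.values())
weighted_total = sum(loss_weights[name] * value for name, value in losses.items())

print(round(unweighted_total, 4))  # 5.0011
print(round(weighted_total, 4))    # 0.3959
```

With a weight of 0.05, the pitch term contributes about 0.24 to the total instead of about 4.85, which puts it on the same scale as the other two losses.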


Train the model.

callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        filepath='./training_checkpoints/ckpt_{epoch}',
        save_weights_only=True),
    tf.keras.callbacks.EarlyStopping(
        monitor='loss',
        patience=5,
        verbose=1,
        restore_best_weights=True),
]

%%time
epochs = 50

history = model.fit(
    train_ds,
    epochs=epochs,
    callbacks=callbacks,
)

Epoch 1/50
361/361 [==============================] - 4s 5ms/step - loss: 0.3075 - duration_loss: 0.0732 - pitch_loss: 4.0974 - step_loss: 0.0294
Epoch 2/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2950 - duration_loss: 0.0696 - pitch_loss: 3.9526 - step_loss: 0.0278
Epoch 3/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2927 - duration_loss: 0.0682 - pitch_loss: 3.9372 - step_loss: 0.0276
Epoch 4/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2918 - duration_loss: 0.0681 - pitch_loss: 3.9232 - step_loss: 0.0275
Epoch 5/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2874 - duration_loss: 0.0657 - pitch_loss: 3.9079 - step_loss: 0.0264
Epoch 6/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2842 - duration_loss: 0.0653 - pitch_loss: 3.8509 - step_loss: 0.0263
Epoch 7/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2820 - duration_loss: 0.0650 - pitch_loss: 3.8090 - step_loss: 0.0265
Epoch 8/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2806 - duration_loss: 0.0654 - pitch_loss: 3.7903 - step_loss: 0.0257
Epoch 9/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2806 - duration_loss: 0.0651 - pitch_loss: 3.7888 - step_loss: 0.0261
Epoch 10/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2778 - duration_loss: 0.0637 - pitch_loss: 3.7690 - step_loss: 0.0256
Epoch 11/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2762 - duration_loss: 0.0624 - pitch_loss: 3.7704 - step_loss: 0.0253
Epoch 12/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2746 - duration_loss: 0.0616 - pitch_loss: 3.7644 - step_loss: 0.0248
Epoch 13/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2728 - duration_loss: 0.0604 - pitch_loss: 3.7591 - step_loss: 0.0244
Epoch 14/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2710 - duration_loss: 0.0584 - pitch_loss: 3.7573 - step_loss: 0.0247
Epoch 15/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2694 - duration_loss: 0.0574 - pitch_loss: 3.7610 - step_loss: 0.0239
Epoch 16/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2686 - duration_loss: 0.0569 - pitch_loss: 3.7529 - step_loss: 0.0240
Epoch 17/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2695 - duration_loss: 0.0577 - pitch_loss: 3.7486 - step_loss: 0.0243
Epoch 18/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2663 - duration_loss: 0.0560 - pitch_loss: 3.7473 - step_loss: 0.0229
Epoch 19/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2642 - duration_loss: 0.0543 - pitch_loss: 3.7366 - step_loss: 0.0231
Epoch 20/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2691 - duration_loss: 0.0587 - pitch_loss: 3.7421 - step_loss: 0.0233
Epoch 21/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2636 - duration_loss: 0.0547 - pitch_loss: 3.7314 - step_loss: 0.0223
Epoch 22/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2613 - duration_loss: 0.0533 - pitch_loss: 3.7313 - step_loss: 0.0215
Epoch 23/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2595 - duration_loss: 0.0516 - pitch_loss: 3.7219 - step_loss: 0.0218
Epoch 24/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2548 - duration_loss: 0.0493 - pitch_loss: 3.7148 - step_loss: 0.0198
Epoch 25/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2539 - duration_loss: 0.0483 - pitch_loss: 3.7150 - step_loss: 0.0199
Epoch 26/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2526 - duration_loss: 0.0474 - pitch_loss: 3.7138 - step_loss: 0.0196
Epoch 27/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2502 - duration_loss: 0.0460 - pitch_loss: 3.7036 - step_loss: 0.0190
Epoch 28/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2467 - duration_loss: 0.0442 - pitch_loss: 3.6970 - step_loss: 0.0177
Epoch 29/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2458 - duration_loss: 0.0438 - pitch_loss: 3.6938 - step_loss: 0.0172
Epoch 30/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2434 - duration_loss: 0.0418 - pitch_loss: 3.6836 - step_loss: 0.0174
Epoch 31/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2404 - duration_loss: 0.0403 - pitch_loss: 3.6703 - step_loss: 0.0166
Epoch 32/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2421 - duration_loss: 0.0412 - pitch_loss: 3.6833 - step_loss: 0.0168
Epoch 33/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2391 - duration_loss: 0.0399 - pitch_loss: 3.6585 - step_loss: 0.0163
Epoch 34/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2376 - duration_loss: 0.0390 - pitch_loss: 3.6467 - step_loss: 0.0163
Epoch 35/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2403 - duration_loss: 0.0417 - pitch_loss: 3.6448 - step_loss: 0.0164
Epoch 36/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2394 - duration_loss: 0.0417 - pitch_loss: 3.6218 - step_loss: 0.0166
Epoch 37/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2337 - duration_loss: 0.0369 - pitch_loss: 3.6155 - step_loss: 0.0161
Epoch 38/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2320 - duration_loss: 0.0357 - pitch_loss: 3.6080 - step_loss: 0.0158
Epoch 39/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2291 - duration_loss: 0.0353 - pitch_loss: 3.5896 - step_loss: 0.0143
Epoch 40/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2285 - duration_loss: 0.0352 - pitch_loss: 3.5784 - step_loss: 0.0144
Epoch 41/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2276 - duration_loss: 0.0338 - pitch_loss: 3.5928 - step_loss: 0.0142
Epoch 42/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2233 - duration_loss: 0.0316 - pitch_loss: 3.5582 - step_loss: 0.0137
Epoch 43/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2211 - duration_loss: 0.0304 - pitch_loss: 3.5453 - step_loss: 0.0134
Epoch 44/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2206 - duration_loss: 0.0307 - pitch_loss: 3.5396 - step_loss: 0.0129
Epoch 45/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2223 - duration_loss: 0.0322 - pitch_loss: 3.5352 - step_loss: 0.0133
Epoch 46/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2213 - duration_loss: 0.0312 - pitch_loss: 3.5323 - step_loss: 0.0135
Epoch 47/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2240 - duration_loss: 0.0329 - pitch_loss: 3.5405 - step_loss: 0.0142
Epoch 48/50
361/361 [==============================] - 2s 6ms/step - loss: 0.2217 - duration_loss: 0.0322 - pitch_loss: 3.5160 - step_loss: 0.0137
Epoch 49/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2167 - duration_loss: 0.0296 - pitch_loss: 3.4894 - step_loss: 0.0126
Epoch 50/50
361/361 [==============================] - 2s 5ms/step - loss: 0.2142 - duration_loss: 0.0278 - pitch_loss: 3.4757 - step_loss: 0.0126
CPU times: user 2min 16s, sys: 23.9 s, total: 2min 40s
Wall time: 1min 41s

plt.plot(history.epoch, history.history['loss'], label='total loss')
plt.show()


Generate notes

To use the model to generate notes, you first need to provide a starting sequence of notes. The function below generates one note from a sequence of notes.

For note pitch, it draws a sample from the softmax distribution of notes produced by the model rather than simply picking the note with the highest probability. Always picking the most probable note would lead to repetitive sequences of notes.

The temperature parameter controls the randomness of the generated notes. You can find more details on temperature in Text generation with an RNN.

def predict_next_note(
    notes: np.ndarray,
    keras_model: tf.keras.Model,
    temperature: float = 1.0) -> tuple[int, float, float]:
  """Generates one note (pitch, step, duration) using a trained sequence model."""

  assert temperature > 0

  # Add batch dimension
  inputs = tf.expand_dims(notes, 0)

  # Use the model passed in as an argument rather than the global `model`
  predictions = keras_model.predict(inputs)
  pitch_logits = predictions['pitch']
  step = predictions['step']
  duration = predictions['duration']

  pitch_logits /= temperature
  pitch = tf.random.categorical(pitch_logits, num_samples=1)
  pitch = tf.squeeze(pitch, axis=-1)
  duration = tf.squeeze(duration, axis=-1)
  step = tf.squeeze(step, axis=-1)

  # `step` and `duration` values should be non-negative
  step = tf.maximum(0, step)
  duration = tf.maximum(0, duration)

  return int(pitch), float(step), float(duration)
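To illustrate the effect of temperature, the sketch below computes softmax(logits / temperature) in plain Python for three made-up logits. It only shows the probabilities that sampling would draw from; `softmax_with_temperature` is a hypothetical helper and no TensorFlow call is involved.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
  """Returns softmax(logits / temperature) as a list of probabilities."""
  assert temperature > 0
  scaled = [logit / temperature for logit in logits]
  peak = max(scaled)  # subtract the max for numerical stability
  exps = [math.exp(value - peak) for value in scaled]
  total = sum(exps)
  return [value / total for value in exps]

logits = [2.0, 1.0, 0.0]  # made-up scores for three pitches
for temperature in (0.5, 1.0, 2.0):
  probs = softmax_with_temperature(logits, temperature)
  print(temperature, [round(p, 3) for p in probs])
```

As the temperature rises, the probability of the top-scoring pitch falls and the distribution flattens, so sampled pitches become more varied; lower temperatures concentrate the probability on the highest logit.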

Now generate some notes. You can experiment with the temperature and the starting sequence in input_notes and see what happens.

temperature = 2.0
num_predictions = 120

sample_notes = np.stack([raw_notes[key] for key in key_order], axis=1)

# The initial sequence of notes; pitch is normalized similar to training
# sequences
input_notes = (
    sample_notes[:seq_length] / np.array([vocab_size, 1, 1]))

generated_notes = []
prev_start = 0
for _ in range(num_predictions):
  pitch, step, duration = predict_next_note(input_notes, model, temperature)
  start = prev_start + step
  end = start + duration
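  input_note = (pitch, step, duration)
  generated_notes.append((*input_note, start, end))
  input_notes = np.delete(input_notes, 0, axis=0)
  # Scale the new pitch like the training inputs before feeding it back
  next_input = np.array([[pitch / vocab_size, step, duration]])
  input_notes = np.append(input_notes, next_input, axis=0)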
  prev_start = start

generated_notes = pd.DataFrame(
    generated_notes, columns=(*key_order, 'start', 'end'))

generated_notes.head(10)
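The generation loop is a sliding window with feedback: each predicted note is appended to the input and the oldest note is dropped. The sketch below shows that pattern in plain Python with a stand-in predictor. The names `generate` and `fake_predict` are hypothetical, and it assumes the pitch is divided by vocab_size before being fed back, matching how the training inputs are scaled.

```python
from collections import deque

def generate(seed, predict, num_predictions, vocab_size=128):
  """Feeds each prediction back into a fixed-length window of recent notes."""
  window = deque(seed, maxlen=len(seed))  # appending drops the oldest note
  generated, prev_start = [], 0.0
  for _ in range(num_predictions):
    pitch, step, duration = predict(list(window))
    start = prev_start + step
    generated.append((pitch, step, duration, start, start + duration))
    # Assumption: scale pitch like the training inputs before feeding it back.
    window.append((pitch / vocab_size, step, duration))
    prev_start = start
  return generated

# Hypothetical stand-in for the trained model: always one semitone higher.
def fake_predict(window):
  last_pitch = round(window[-1][0] * 128)
  return last_pitch + 1, 0.5, 0.25

seed = [(60 / 128, 0.0, 0.5), (62 / 128, 0.5, 0.5)]
notes = generate(seed, fake_predict, num_predictions=3)
print([note[0] for note in notes])  # [63, 64, 65]
```

A deque with maxlen keeps the window at a fixed length without the explicit delete-then-append pair.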


out_file = 'output.mid'
out_pm = notes_to_midi(
    generated_notes, out_file=out_file, instrument_name=instrument_name)
display_audio(out_pm)

You can also download the audio file by adding the two lines below:

from google.colab import files
files.download(out_file)

Visualize the generated notes.

plot_piano_roll(generated_notes)


Check the distributions of pitch, step, and duration.

plot_distributions(generated_notes)


In the plots above, you will notice a change in the distribution of the note variables. Because there is a feedback loop between the model's outputs and inputs, the model tends to generate similar sequences of outputs to reduce the loss. This is particularly relevant for step and duration, which use the MSE loss. For pitch, you can increase the randomness by increasing the temperature in predict_next_note.
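One possible intuition for the narrower step and duration distributions, offered here as an assumption rather than something the tutorial establishes, is that a predictor trained with MSE is pulled toward average values. The sketch below uses made-up durations to show that, among several constant guesses, the mean gives the lowest MSE.

```python
def mse(values, guess):
  return sum((value - guess) ** 2 for value in values) / len(values)

durations = [0.1, 0.1, 0.2, 1.0]  # made-up, skewed toward short notes
mean = sum(durations) / len(durations)

# Among these constant guesses, the mean gives the lowest MSE.
candidates = [0.1, 0.2, mean, 0.5, 1.0]
best = min(candidates, key=lambda guess: mse(durations, guess))
print(best == mean)  # True
```

A model that hedges toward the mean in this way would rarely output the very short or very long values seen in the training data.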

Next steps

This tutorial demonstrated the mechanics of using an RNN to generate sequences of notes from a dataset of MIDI files. To learn more, you can visit the closely related Text generation with an RNN tutorial, which contains additional diagrams and explanations.

An alternative to using RNNs for music generation is to use GANs. Rather than generating audio, a GAN-based approach can generate an entire sequence in parallel. The Magenta team has done impressive work on this approach with GANSynth. You can also find many wonderful music and art projects, along with open-source code, on the Magenta project website.