GitHub - DamRsn/NeuralNote: Audio plugin for audio-to-MIDI transcription using deep learning.
2026-05-24 · via Hacker News: Front Page

NeuralNote is an audio plugin that brings state-of-the-art audio-to-MIDI conversion into your favorite Digital Audio Workstation.

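  • Works with any tonal instrument (voice included)
  • Supports polyphonic transcription
  • Supports pitch bend detection
  • Lightweight and very fast transcription
  • Lets you adjust the parameters while listening to the transcription
  • Lets you scale and time-quantize the transcribed MIDI directly in the plugin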
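To illustrate what time quantization of transcribed notes means, here is a minimal Python sketch. It is not NeuralNote's implementation: the note layout `(start_seconds, end_seconds, midi_pitch)`, the function name `quantize_notes`, and the `bpm` / `steps_per_beat` parameters are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical note layout: (start_seconds, end_seconds, midi_pitch)
Note = Tuple[float, float, int]


def quantize_notes(notes: List[Note], bpm: float, steps_per_beat: int = 4) -> List[Note]:
    """Snap note starts and ends to the nearest line of a regular time grid."""
    step = 60.0 / bpm / steps_per_beat  # grid spacing in seconds
    quantized = []
    for start, end, pitch in notes:
        q_start = round(start / step) * step
        q_end = round(end / step) * step
        # Keep every note at least one grid step long.
        if q_end <= q_start:
            q_end = q_start + step
        quantized.append((q_start, q_end, pitch))
    return quantized


if __name__ == "__main__":
    notes = [(0.02, 0.49, 60), (0.51, 0.55, 64)]
    print(quantize_notes(notes, bpm=120))
```

With `steps_per_beat=4` the grid is in sixteenth notes. A real quantizer would probably also offer a strength setting that moves notes only part of the way to the grid.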

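Install NeuralNote

Download the latest release for your platform here (Windows, macOS (Universal), and Linux are supported)!

On Windows and Mac, installers are provided, with Standalone, VST3, and AU (Mac only) versions. The installer lets you choose which formats to install. On macOS the code is signed, but on Windows it is not, so a few extra steps may be needed to use NeuralNote on Windows.

For Linux, native binaries are provided for the VST3 and the standalone application. Install them by copying the files to the appropriate locations.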

Usage

UI

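NeuralNote comes as a simple audio plugin (VST3/AU/Standalone app) to be applied to the track you want to transcribe.

The workflow is very simple:

  • Gather some audio
    • Click record. This works when recording live or when playing back a track in a DAW.
    • Or drop an audio file on the plugin. (.wav, .aiff, .flac, .mp3 and .ogg (vorbis) are supported.)
  • The MIDI transcription instantly appears in the piano roll.
  • Click the play button to listen to the result.
    • Play with the different settings to adjust the transcription, even while listening to it.
    • Individually adjust the level of the source audio and of the synthesized transcription.
  • Once you are satisfied, export the MIDI transcription by dragging and dropping it from the plugin onto a MIDI track.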

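Watch our presentation video for the Neural Audio Plugin competition here.

Internally, NeuralNote uses the model from Spotify's basic-pitch. See their blog post and paper for more information. In NeuralNote, basic-pitch is run using RTNeural for the CNN part and ONNXRuntime for the feature part (Constant-Q transform calculation + Harmonic Stacking). As part of this project, we contributed to RTNeural to add 2D convolution support.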
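As a rough illustration of the Harmonic Stacking step mentioned above, the following Python sketch builds one frequency-shifted copy of a CQT per harmonic, so that a note's harmonics line up at the bin of its fundamental. This is a sketch under assumptions, not the code NeuralNote runs through ONNXRuntime: the function name `harmonic_stack`, the `[time][bin]` list layout, and the harmonic list in the usage example are all chosen for this example.

```python
import math
from typing import List, Sequence


def harmonic_stack(
    cqt: Sequence[Sequence[float]],
    bins_per_octave: int,
    harmonics: Sequence[float],
) -> List[List[List[float]]]:
    """Return result[h][t][f]: the CQT shifted so bin f holds the energy at harmonic h of f."""
    n_bins = len(cqt[0]) if cqt else 0
    stacked = []
    for h in harmonics:
        # On a log-frequency axis, multiplying the frequency by h is a constant bin shift.
        shift = round(bins_per_octave * math.log2(h))
        channel = []
        for frame in cqt:
            channel.append(
                [frame[f + shift] if 0 <= f + shift < n_bins else 0.0 for f in range(n_bins)]
            )
        stacked.append(channel)
    return stacked


if __name__ == "__main__":
    # One frame, 12 bins per octave, a single peak one octave above the lowest bin.
    frame = [0.0] * 36
    frame[12] = 1.0
    out = harmonic_stack([frame], bins_per_octave=12, harmonics=[0.5, 1, 2])
    print([channel[0].index(1.0) for channel in out])
```

Bins shifted in from outside the analysed range are zero-filled here; the real model may treat the edges differently.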

Build from source

Requirements are: git, cmake, and your OS's preferred compiler suite.

Use this when cloning:

git clone --recurse-submodules --shallow-submodules https://github.com/DamRsn/NeuralNote

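The following OS-specific build scripts have to be executed at least once before the project can be used as a normal CMake project. The script downloads the onnxruntime static library (which we created with ort-builder) and then calls CMake.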

macOS

$ ./build.sh

Windows

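Due to a known issue, if you are not using Visual Studio 2022 (MSVC version 19.35.x; check the output of cl), you will need to build onnxruntime.lib manually, like so:

  1. Make sure Python is installed; if not, download it at https://www.python.org/downloads/windows/ (this does not currently work with Python 3.11, prefer Python 3.10).

  2. Execute each of the following lines in a command prompt: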

git clone --depth 1 --recurse-submodules --shallow-submodules https://github.com/tiborvass/libonnxruntime-neuralnote ThirdParty\onnxruntime
cd ThirdParty\onnxruntime
python3 -m venv venv
.\venv\Scripts\activate.bat
pip install -r requirements.txt
.\convert-model-to-ort.bat model.onnx
.\build-win.bat model.required_operators_and_types.with_runtime_opt.config
copy model.with_runtime_opt.ort ..\..\Lib\ModelData\features_model.ort
cd ..\..

Now you can get back to building NeuralNote as follows:

> .\build.bat

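IDEs

Once the build script has been executed at least once, you can load this project in your favorite IDE (CLion/Visual Studio/VSCode/etc.) and click 'build' for one of the targets.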

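Reuse code from NeuralNote's transcription engine

All the code that performs the transcription is in Lib/Model, and all the model weights are in Lib/ModelData/. Feel free to use only this part of the code in your own project! We will try to isolate it further from the rest of the repo in the future and turn it into a standalone library.

The code that generates the files in Lib/ModelData/ is not currently available, as it required a lot of manual operations. Here is a description of the process we followed to create those files:

  • features_model.onnx was generated with tf2onnx by converting a keras model containing only the CQT + Harmonic Stacking part of the full basic-pitch graph (with manually added weights for batch normalization).
  • The .json files containing the weights of the basic-pitch CNN were generated from the tensorflow-js model available in the basic-pitch-ts repository, which was then converted to onnx with tf2onnx. Finally, the weights were gathered manually into .npy files with the help of Netron and applied to a split keras model created with the basic-pitch code.

The original basic-pitch CNN was split into 4 sequential models wired together, so that they can be run with RTNeural.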

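Bug reports and feature requests

If you have any request or suggestion concerning the plugin, or if you encounter a bug, please file a GitHub issue.

Contributing

Contributions are most welcome! If you want to add features to the plugin or simply improve the documentation, please open a PR!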

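License

NeuralNote software and code is published under the Apache-2.0 license. See the license file.

Third-party libraries used and license

Here is the list of all the third-party libraries used in NeuralNote and the licenses under which they are used.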

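Could NeuralNote transcribe audio in real-time?

Unfortunately not, for a few reasons:

  • Basic Pitch uses the Constant-Q transform (CQT) as its input feature. The CQT requires really long audio chunks (> 1 s) to get amplitudes for the lowest frequency bins. This makes the latency too high for real-time transcription.
  • The basic-pitch CNN adds a latency of approximately 120 ms.
  • The note-event creation algorithm processes its input backward (from future to past) and is therefore non-causal.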
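The first two points can be made concrete with a small calculation. In the textbook constant-Q transform, every bin has the same ratio Q of centre frequency to bandwidth, so the analysis window for a bin at frequency f lasts about Q / f seconds. The sketch below uses that textbook formula; the lowest frequency (27.5 Hz) and the resolution (36 bins per octave) are assumed values for illustration, not settings read from NeuralNote's code, and actual CQT implementations may scale their windows differently.

```python
def cqt_window_seconds(f_min_hz: float, bins_per_octave: int) -> float:
    """Approximate analysis-window length for the lowest CQT bin, in seconds."""
    # Constant ratio of centre frequency to bandwidth for geometrically spaced bins.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    # The window spans about Q periods of the bin's centre frequency.
    return q / f_min_hz


if __name__ == "__main__":
    # Assumed values for illustration only.
    window = cqt_window_seconds(f_min_hz=27.5, bins_per_octave=36)
    cnn_latency = 0.120  # seconds, the approximate figure quoted above
    print(f"lowest-bin window: {window:.2f} s")
    print(f"with CNN latency:  {window + cnn_latency:.2f} s")
```

Under these assumptions the lowest bin alone needs well over a second of audio, which is consistent with the "> 1 s" figure above. Raising the lowest frequency or lowering the resolution shortens the window, at the cost of range or pitch precision.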

If you have ideas for working around these limits, please share!

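Credits

NeuralNote was developed by Damien Ronssin and Tibor Vass. The plugin user interface was designed by Perrine Morel.

Contributors

Many thanks to the contributors!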