The Bakechan robot is finally complete!

Wasted a huge amount of filament along the way #3dprint pic.twitter.com/DyP9zfSTXA

— HomeMadeGarbage (@H0meMadeGarbage) June 3, 2023

(Many Bakechan test subjects were sacrificed... 👻)

And Bakechan doesn't just walk: it talks, too.

Bakechan talks 👻“🫧

— ଳ ଳ ଳ WELCOME脳 ଳ ଳ ଳ (@WE1C0MEN0) April 14, 2023

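How this will be implemented on the Bakechan side is still undecided,
but for now I'll assume that Bakechan POSTs a WAV file
and try out speech recognition in Node-RED.

OpenAI's Speech to Text API

I recently tried the ChatGPT API from Node-RED,
so I decided to use OpenAI for speech recognition as well.

Trying the ChatGPT API with Node-RED

With the API key obtained in the post above, I'll use the Speech to Text API for speech recognition.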

Speech to text – OpenAI API

Speech recognition with Node-RED + OpenAI

Flow

[{"id":"8f2dbdef1112307d","type":"tab","label":"Flow 1","disabled":false,"info":"","env":[]},{"id":"47bc29b0322e9eed","type":"http in","z":"8f2dbdef1112307d","name":"","url":"upload","method":"get","upload":false,"swaggerDoc":"","x":110,"y":160,"wires":[["d2e4a0488c7a34c1"]]},{"id":"d2e4a0488c7a34c1","type":"template","z":"8f2dbdef1112307d","name":"","field":"payload","fieldType":"msg","format":"handlebars","syntax":"mustache","template":"<html>\n<head>\n<meta charset=\"utf-8\">\n<title>Upload</title>\n</head>\n<body>\n<form action=\"/upload\" method=\"post\" enctype=\"multipart/form-data\">\n<input name=\"file\" type=\"file\" id=\"file\"><br>\n<input type=\"submit\" value=\"Upload\">\n</form>\n</body>\n</html>","output":"str","x":600,"y":160,"wires":[["79821651c3b17b49"]]},{"id":"79821651c3b17b49","type":"http response","z":"8f2dbdef1112307d","name":"","statusCode":"","headers":{},"x":770,"y":180,"wires":[]},{"id":"6cdc9d06434855a3","type":"http in","z":"8f2dbdef1112307d","name":"","url":"upload","method":"post","upload":true,"swaggerDoc":"","x":110,"y":240,"wires":[["13e1e3a2e2bad19b"]]},{"id":"b6444e2cce2c226c","type":"template","z":"8f2dbdef1112307d","name":"","field":"payload","fieldType":"msg","format":"handlebars","syntax":"mustache","template":"<!doctype html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n<title>Result</title>\n</head>\n<body>\n<p>{{payload}}</p>\n</body>\n</html>","output":"str","x":600,"y":200,"wires":[["79821651c3b17b49"]]},{"id":"7c8ac5a5fc89665d","type":"http request","z":"8f2dbdef1112307d","name":"","method":"POST","ret":"txt","paytoqs":"ignore","url":"https://api.openai.com/v1/audio/transcriptions","tls":"","persist":false,"proxy":"","insecureHTTPParser":false,"authType":"","senderr":false,"headers":[],"x":410,"y":240,"wires":[["7b48149cebb30bb4","b6444e2cce2c226c"]]},{"id":"13e1e3a2e2bad19b","type":"function","z":"8f2dbdef1112307d","name":"function 2","func":"const sound = msg.req.files[0].buffer;\nmsg.payload = {\n    'model': {\n        'value': 'whisper-1'\n    },\n\n    'language': {\n        'value': 'ja'\n    },\n\n    'file': {\n        'value': sound,\n        'options': {\n            'filename': 'test.wav',\n            'Content-Type': 'audio/wav',\n        }\n    }\n};\n\nmsg.headers = {\n    'Authorization': 'Bearer [API key]',\n    'Content-Type': 'multipart/form-data'\n};\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":260,"y":240,"wires":[["7c8ac5a5fc89665d"]]},{"id":"7b48149cebb30bb4","type":"debug","z":"8f2dbdef1112307d","name":"debug 2","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":600,"y":260,"wires":[]}]

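I used the following as references 🙏

パンダコノート: How to receive an uploaded file in Node-RED

Speech recognition with Whisper and answers from ChatGPT using Node-RED – Qiita

About the nodes

File upload > POST

Since the plan is for Bakechan to do the POST,
this time I'll test by uploading a file from my local machine and POSTing it.

template node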

<html>
<head>
<meta charset="utf-8">
<title>Upload</title>
</head>
<body>
<form action="/upload" method="post" enctype="multipart/form-data">
<input name="file" type="file" id="file"><br>
<input type="submit" value="Upload">
</form>
</body>
</html>

The page looks like this.

POSTing the WAV to the API

http in node

This node receives the POST.

function node

This node builds the data needed to send the file received above to the API.

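// Buffer of the first uploaded file (provided by the http in node)
const sound = msg.req.files[0].buffer;
// Multipart form fields for the transcription API
msg.payload = {
    'model': {
        'value': 'whisper-1'
    },
    'language': {
        'value': 'ja'
    },
    'file': {
        'value': sound,
        'options': {
            'filename': 'test.wav',
            'Content-Type': 'audio/wav',
        }
    }
};
// Replace [API key] with your own OpenAI API key
msg.headers = {
    'Authorization': 'Bearer [API key]',
    'Content-Type': 'multipart/form-data'
};
return msg;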
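
As a variation on the function node above, here is a minimal sketch that takes the filename from the uploaded file instead of hard-coding `test.wav`, and fails early when no file was uploaded. The helper name `buildTranscriptionMessage` and the `OPENAI_API_KEY` environment variable are my own assumptions and not part of the original flow; `originalname` is the field that multer (used by Node-RED's http in node for uploads) attaches to each file.

```javascript
// Hypothetical helper: builds the multipart payload and headers for the
// http request node from the files received by the http in node.
function buildTranscriptionMessage(files, apiKey) {
    if (!Array.isArray(files) || files.length === 0) {
        throw new Error('No file was uploaded');
    }
    const file = files[0];
    return {
        payload: {
            model: { value: 'whisper-1' },
            language: { value: 'ja' },
            file: {
                value: file.buffer,
                options: {
                    // Fall back to the fixed name from the original code.
                    filename: file.originalname || 'test.wav'
                }
            }
        },
        headers: {
            Authorization: `Bearer ${apiKey}`,
            'Content-Type': 'multipart/form-data'
        }
    };
}

// Inside a function node this could be used roughly as follows
// (OPENAI_API_KEY is an assumed environment variable name):
//   const built = buildTranscriptionMessage(msg.req.files, env.get('OPENAI_API_KEY'));
//   msg.payload = built.payload;
//   msg.headers = built.headers;
//   return msg;
```

Keeping the key in an environment variable also avoids pasting it into an exported flow.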

http request

Set the URL of the API.

https://api.openai.com/v1/audio/transcriptions

template

This node also displays the result received from the API on the HTML page.

<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Result</title>
</head>
<body>
<p>{{payload}}</p>
</body>
</html>
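
The http request node in this flow returns the response body as a string, so `{{payload}}` shows the raw JSON. As a minimal sketch, a function node placed in front of this template could pull out just the transcript, assuming the API's default JSON response with a `text` field. `extractTranscript` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the transcript from the API response body,
// or a readable message when the API reports an error.
function extractTranscript(body) {
    let data;
    try {
        data = typeof body === 'string' ? JSON.parse(body) : body;
    } catch (err) {
        // Not JSON: show the body unchanged.
        return String(body);
    }
    if (data && data.error && data.error.message) {
        return `Error: ${data.error.message}`;
    }
    if (data && typeof data.text === 'string') {
        return data.text;
    }
    return String(body);
}

// In a function node:
//   msg.payload = extractTranscript(msg.payload);
//   return msg;
```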

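Testing

Audio data

I used a clip of roughly the first 5 seconds of a radio show.

* The radio show ↓

Select a file

Go to http://localhost:1880/upload

Select a file and press the upload button

A few seconds later, the page switches!

Amazing! It was recognized properly!! 🙌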
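
The browser steps above can also be reproduced from a script, which is closer to the eventual "Bakechan POSTs a WAV" scenario. This is a minimal sketch that assumes Node.js 18 or later (for the built-in `fetch`, `FormData`, and `Blob`) and the flow running at `http://localhost:1880/upload`; the helper names and the `./test.wav` path are hypothetical.

```javascript
const fs = require('fs');

// Hypothetical helper: wraps WAV bytes in a multipart form using the same
// field name ("file") as the upload form in the template node.
function buildUploadForm(wavBytes, filename) {
    const form = new FormData();
    form.append('file', new Blob([wavBytes], { type: 'audio/wav' }), filename);
    return form;
}

// Hypothetical helper: POSTs the file to the flow and returns the response body.
async function uploadWav(path, url = 'http://localhost:1880/upload') {
    const form = buildUploadForm(fs.readFileSync(path), 'test.wav');
    const res = await fetch(url, { method: 'POST', body: form });
    return res.text();
}

// Usage (requires the flow to be running):
//   uploadWav('./test.wav').then(console.log);
```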

Where I got stuck along the way (errors)

I hit a few errors before everything worked, so I'm recording them here.

① The API key is incorrect

{
    "error": {
        "message": "You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys.",
        "type": "invalid_request_error",
        "param": null,
        "code": null
    }
}

This error occurred because I had left out "Bearer " at the start of the 'Authorization' header value. Once I added it correctly, the error above went away.
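
To guard against this mistake, the header value can be built by a small helper instead of being typed by hand. A minimal sketch; `bearerHeader` is a hypothetical name, not something provided by Node-RED or OpenAI.

```javascript
// Hypothetical helper: returns an Authorization header value that always
// starts with "Bearer ", whether or not the caller already included it.
function bearerHeader(apiKey) {
    const key = String(apiKey || '').trim().replace(/^Bearer\b\s*/i, '');
    if (key === '') {
        throw new Error('API key is empty');
    }
    return `Bearer ${key}`;
}

// In the function node:
//   msg.headers = { 'Authorization': bearerHeader(apiKey), ... };
```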

② validation error

{
  "error": {
    "message": "1 validation error for Request\nbody -> file\n  Expected UploadFile, received: <class 'str'> (type=value_error)",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

{