Thoughts on summarization service system design
A summarization task takes an input and reduces it to a handful of paragraphs. That input has to be text. You don’t necessarily start from text, though, since the source content can be an audio or video file. In that case the source has to be converted to text first, which adds a transcription step. Transcription means taking audio and converting it to text. Luckily, these days there are APIs you can use for this. Supported formats vary by provider, but it’s safe to assume most accept WAVE or FLAC encoding. ...
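To make the flow above concrete, here is a minimal Python sketch of the "get everything into text first" step. It is an illustration under assumptions, not a definitive design: the extension sets are arbitrary, and `read_text`, `extract_audio`, and `transcribe` are hypothetical callables standing in for whatever file reader, audio extractor, and transcription API you end up using. Pulling the audio track out of a video before transcribing is also an assumption about how the video case would be handled.

```python
from pathlib import Path
from typing import Callable

# Assumed extension sets, chosen for illustration only.
TEXT_EXTS = {".txt", ".md"}
AUDIO_EXTS = {".wav", ".flac"}  # encodings most transcription APIs are assumed to accept
VIDEO_EXTS = {".mp4", ".mov", ".mkv"}


def classify_source(path: str) -> str:
    """Return 'text', 'audio', or 'video' based on the file extension."""
    ext = Path(path).suffix.lower()
    if ext in TEXT_EXTS:
        return "text"
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"unsupported source type: {ext or path}")


def to_text(
    path: str,
    read_text: Callable[[str], str],      # hypothetical: reads a text file
    extract_audio: Callable[[str], str],  # hypothetical: video path -> audio path
    transcribe: Callable[[str], str],     # hypothetical: wraps a transcription API
) -> str:
    """Normalize any supported source into plain text for the summarizer."""
    kind = classify_source(path)
    if kind == "text":
        return read_text(path)
    if kind == "video":
        # Assumption: video is reduced to an audio file before transcription.
        path = extract_audio(path)
    return transcribe(path)
```

Passing the three steps in as callables keeps the routing logic independent of any particular provider, so swapping the transcription API later would not touch this function.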