You don’t think that will be possible with AI technology?
No. It is not possible even with AI at this time. If the multitrack audio were available to you (not 2-track stereo), then maybe it could be done with some time and effort... one track at a time.
I think there are some false expectations about what digital technology can or will be able to do. I know virtual reality is very impressive. You know there is voice recognition software that works for a single voice (it must learn that person's voice), but there is still no software that can identify multiple human voices, sort them out and make a reliable transcription for even a handful of people. Then there is facial recognition software, which is becoming more reliable.
If there is any possibility of making a MIDI file from a multitrack audio recording, it will be a very long time coming. It would require recognizing the sounds of many acoustic, electric and electronic synth instruments, recognizing Pitch Bend and both amplitude and filter modulation, and recognizing time signatures and key signatures as well. Then the AI would need full knowledge of the entire MIDI Specification to map the interpreted audio to MIDI. This would include Bank Select and Program Change for a specific keyboard or sound module... MIDI Volume, Note Velocity, Note Gate time, Pitch Bend, real-time Filter and Pan sweeps, and System Exclusive strings and data formats for many manufacturers, where sysex is used to program DSPs and other special parameters, etc.
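To give a rough idea of the MIDI side of that mapping, here is a minimal Python sketch of how a few of those messages are laid out as raw bytes. The function names are my own, made up for illustration, and are not from any library. The status bytes and controller numbers (0 and 32 for Bank Select, 7 for Volume, 10 for Pan) are the standard MIDI 1.0 ones. This only shows the easy half: nothing here does the hard part of working out from audio which notes, bends and patches to send.

```python
# Hypothetical helpers that build raw MIDI 1.0 channel messages.
# channel is 0-15; all data bytes are limited to 7 bits (0-127).

def control_change(channel, controller, value):
    """Control Change: status 0xB0 + channel, controller number, value."""
    return bytes([0xB0 | (channel & 0x0F), controller & 0x7F, value & 0x7F])

def program_change(channel, program):
    """Program Change: status 0xC0 + channel, one data byte."""
    return bytes([0xC0 | (channel & 0x0F), program & 0x7F])

def note_on(channel, note, velocity):
    """Note On: status 0x90 + channel, note number, velocity."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

def note_off(channel, note, velocity=0):
    """Note Off: status 0x80 + channel. Gate time is the gap between On and Off."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

def pitch_bend(channel, value):
    """Pitch Bend: 14-bit value (0-16383, 8192 = no bend), sent LSB first."""
    return bytes([0xE0 | (channel & 0x0F), value & 0x7F, (value >> 7) & 0x7F])

def select_patch(channel, bank_msb, bank_lsb, program):
    """Bank Select MSB (CC 0) and LSB (CC 32), followed by Program Change."""
    return (control_change(channel, 0, bank_msb)
            + control_change(channel, 32, bank_lsb)
            + program_change(channel, program))

if __name__ == "__main__":
    # Pick a patch, set volume and pan, then play one bent middle C.
    stream = (select_patch(0, 0, 0, 4)
              + control_change(0, 7, 100)   # MIDI Volume
              + control_change(0, 10, 64)   # Pan, centered
              + note_on(0, 60, 100)
              + pitch_bend(0, 10000)
              + note_off(0, 60))
    print(stream.hex(" "))
```

Which bank and program numbers actually call up a given sound depends on the particular keyboard or sound module, and sysex is different again for each manufacturer, so even this small sketch cannot be written generically for those parts.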
If it is possible one day... it won't be any time soon.
Joe H