Actually, there are three main types of SMF: Type 0, 1, and 2. Type 2 isn’t much used, but Types 0 and 1 are.

Type 0 has 1 track consisting of up to 16 MIDI channels. The MIDI events are in order of time, regardless of channel (meaning all channels are interleaved together into one MIDI stream, just as real live cable-type hardware MIDI is).

Type 1 has multiple tracks (no real limit) each consisting of up to 16 MIDI channels. While each track has its events in order of time regardless of channel, the tracks are recorded one after another in the file, and remain discrete units. In effect, Type 1 is simply a collection of multiple Type 0 streams, each one called a “Track.”

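(Type 2 also holds multiple tracks, but each track is an independent sequence or pattern rather than one part of the same song, so it is more like several Type 0 files bundled into one. Not really very useful in real life.)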
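
To make the type distinction a bit more concrete: the type is stored in the 14-byte "MThd" header chunk at the very start of the file, along with the number of tracks and the resolution. Here is a minimal sketch in Python. The function name parse_smf_header is my own invention, and the sketch assumes the resolution is given in ticks per quarter note (it simply rejects the rarer SMPTE form).

```python
import struct

def parse_smf_header(data: bytes) -> dict:
    """Read the 14-byte MThd chunk at the start of a Standard MIDI File."""
    if len(data) < 14:
        raise ValueError("too short to hold an SMF header")
    # Big-endian: 4-byte chunk id, 32-bit length, then three 16-bit fields.
    chunk_id, length, fmt, ntrks, division = struct.unpack(">4sIHHH", data[:14])
    if chunk_id != b"MThd" or length < 6:
        raise ValueError("not a Standard MIDI File")
    if fmt not in (0, 1, 2):
        raise ValueError(f"unknown SMF type {fmt}")
    if division & 0x8000:
        # Top bit set means SMPTE timing, which this sketch does not handle.
        raise ValueError("SMPTE time division is not handled in this sketch")
    return {"type": fmt, "tracks": ntrks, "ppq": division}

# Hand-built header for illustration: Type 1, 3 tracks, 480 ticks per quarter note.
example = b"MThd" + struct.pack(">IHHH", 6, 1, 3, 480)
print(parse_smf_header(example))  # {'type': 1, 'tracks': 3, 'ppq': 480}
```

A Type 0 file would report "tracks" as 1 here.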

Most MIDI software wants Type 1, with one channel per track, because it makes editing the songs easier. If all the MIDI channels are in one track, as in a Type 0 file, then the sequencer software first has to sort the events by channel and split them into individual tracks before they can be edited.
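
The per-channel split a sequencer has to do on a Type 0 track might look roughly like this. It is only a sketch: the (tick, status, data1, data2) tuple layout and the name split_by_channel are assumptions of mine for illustration, not how any particular sequencer stores events.

```python
from collections import defaultdict

def split_by_channel(events):
    """Group (tick, status, data1, data2) tuples by MIDI channel (1-16)."""
    per_channel = defaultdict(list)
    for event in events:
        status = event[1]
        if 0x80 <= status <= 0xEF:          # channel voice message
            channel = (status & 0x0F) + 1   # low nibble holds channel 0-15
            per_channel[channel].append(event)
    return dict(per_channel)

# A tiny interleaved Type 0 style track: piano on channel 1, drums on channel 10.
type0_track = [
    (0,   0x90, 60, 100),  # note on, channel 1
    (0,   0x99, 36, 110),  # note on, channel 10
    (240, 0x80, 60, 0),    # note off, channel 1
    (240, 0x89, 36, 0),    # note off, channel 10
]
tracks = split_by_channel(type0_track)
print(sorted(tracks))  # [1, 10]
```

Events within each channel keep their original time order, since the input is walked from start to end.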

Type 0, on the other hand, can stream-play right off of even an ordinary floppy disk without first having to be loaded entirely into memory, assuming the keyboard or other floppy-equipped device supports this feature (my Yamaha PSR-7000 does). This is simply not possible with Type 1 (let alone Type 2) because the entire file would first need to be processed before any of it could be played, to determine the proper order of MIDI events, since the tracks are placed in sequential order in the file, one after another.
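
Here is a small sketch of why a Type 1 player needs every track in hand before it can start. The second track's first event belongs at time 0, yet in the file it sits after the entire first track. The (absolute tick, message) pairs and the name merge_tracks are made up for illustration; real players work on the raw bytes.

```python
import heapq

def merge_tracks(tracks):
    """Merge per-track lists of (abs_tick, message) into one time-ordered list.

    Each input track must already be sorted by abs_tick. The track index is
    used as a tie-breaker so simultaneous events come out in track order.
    """
    tagged = (
        [(tick, index, msg) for tick, msg in track]
        for index, track in enumerate(tracks)
    )
    return [(tick, msg) for tick, _, msg in heapq.merge(*tagged)]

melody = [(0, "ch1 note on C4"), (480, "ch1 note off C4")]
bass = [(0, "ch2 note on C2"), (240, "ch2 note off C2")]

for tick, msg in merge_tracks([melody, bass]):
    print(tick, msg)
# 0 ch1 note on C4
# 0 ch2 note on C2
# 240 ch2 note off C2
# 480 ch1 note off C4
```

The merged result is effectively what a Type 0 file already stores on disk, which is why it can be played as it is read.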

Regardless of type, SMF files can contain any MIDI events, plus special SMF-specific “meta-events.” One obvious addition is timing information: since real MIDI streams happen in real time, there is no need to specify when each MIDI event is to happen. It happens when the event comes down the wire. Not so with a disk file, which has all the events in one file. They may be in sequential order (per track in the case of Types 1 and 2), but that by itself doesn’t say how much time passes between any two adjacent events. So every event in the file is preceded by a delta-time, which is the number of Ticks since the previous event in that track. Sequencers usually display these times in terms of Measure, Beat, and Tick, where the number of Ticks per quarter note is defined by the resolution (in PPQ), a value stored in the SMF header chunk rather than in a meta-event. Tempo and time signature, which are needed to turn Ticks into actual seconds and Measures, are specified in meta-events.
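
In the file itself, the time between two events is stored as a delta-time in Ticks, written as a "variable-length quantity" (7 bits per byte, with the top bit set on every byte except the last). The sketch below decodes one of those and then converts an absolute Tick count to Measure/Beat/Tick. The function names are mine, and ticks_to_mbt assumes a constant time signature with a quarter-note beat, which real songs don't always have.

```python
def read_vlq(data: bytes, pos: int = 0):
    """Decode one variable-length quantity; return (value, next_position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)   # append the low 7 bits
        if not byte & 0x80:                    # top bit clear marks the last byte
            return value, pos

def ticks_to_mbt(abs_tick: int, ppq: int, beats_per_measure: int = 4):
    """Convert an absolute tick count to 1-based (measure, beat) plus tick.

    Assumes one fixed time signature of beats_per_measure quarter notes.
    """
    beat_index, tick = divmod(abs_tick, ppq)
    measure_index, beat = divmod(beat_index, beats_per_measure)
    return measure_index + 1, beat + 1, tick

print(read_vlq(b"\x81\x48"))    # (200, 2)
print(ticks_to_mbt(2000, 480))  # (2, 1, 80)
```

So at 480 PPQ in 4/4, Tick 2000 lands 80 Ticks into the first beat of measure 2.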

Lyrics and other text information (title, composer, copyright, Track names in the case of Types 1 and 2, etc.) are other types of meta-events. Markers are another (basically text meta-events at specific time points that are general to all tracks). They are used, for instance, in Yamaha StyleFile Format files to denote where Intro A, Intro B, Main A, Main B, Main C, Fill A, Fill B, Ending A, Ending B, etc. are. (Style files are themselves Type 1 SMF files with additional CASM information that follows the SMF Type 1 part. CASM is used to specify, among other things, how notes in the tracks should be transposed to adjust for the chord being played.)
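
On disk, a text-type meta-event is just the byte FF, a type byte, a length, and the text. The following sketch builds and reads back a marker. The helper names are hypothetical, the text is assumed to be plain ASCII, and the parser only handles texts shorter than 128 bytes to keep things short. Inside a real track, each meta-event is also preceded by its delta-time, which is left out here.

```python
# Type bytes for a few of the text meta-events defined by the SMF format.
META_TYPES = {
    0x01: "text",
    0x02: "copyright",
    0x03: "track name",
    0x05: "lyric",
    0x06: "marker",
}

def write_vlq(value: int) -> bytes:
    """Encode a non-negative integer as a variable-length quantity."""
    out = [value & 0x7F]                     # last byte has its top bit clear
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)    # earlier bytes have it set
        value >>= 7
    return bytes(reversed(out))

def build_text_meta(meta_type: int, text: str) -> bytes:
    """Build FF <type> <length> <text> for a text-style meta-event."""
    payload = text.encode("ascii")
    return bytes([0xFF, meta_type]) + write_vlq(len(payload)) + payload

def parse_text_meta(data: bytes):
    """Parse a text meta-event whose length fits in a single byte."""
    if data[0] != 0xFF or data[1] not in META_TYPES:
        raise ValueError("not a known text meta-event")
    length = data[2]
    if length & 0x80:
        raise ValueError("multi-byte lengths are not handled in this sketch")
    return META_TYPES[data[1]], data[3:3 + length].decode("ascii")

marker = build_text_meta(0x06, "Main A")
print(marker.hex(" "))          # ff 06 06 4d 61 69 6e 20 41
print(parse_text_meta(marker))  # ('marker', 'Main A')
```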

While it is true that instruments have their own sets of voice sounds, some additional compatibility is made possible by the General MIDI Level 1 standard and various extensions to that standard (mostly proprietary, such as Roland GS, Yamaha XG, GEM/Baldwin GMX, and Technics NX, but now there is also an official General MIDI Level 2 standard). GM means that a part recorded for a timpani sound on one GM Level 1 compatible device or software will not wind up being played by a piccolo sound on another device. It will be played by a timpani sound, regardless of who makes the device (of course, the quality of the timpani sound can vary from device to device, but at least it won’t be a drastically wrong instrument!). Worse yet, without GM a two-handed ragtime piano track could wind up being played by a drum kit, or vice versa (imagine the drum solo to “In-a-Gadda-da-Vida” being MIDIfied and mistakenly played back as seemingly random notes by some pitched sound like, say, an accordion!).
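
What GM actually pins down is which Program Change number selects which kind of sound, plus the convention that channel 10 is the drum channel. Here is a rough sketch of building a Program Change message. The three program numbers are the 1-based GM Level 1 numbers as I know them, so check them against the published GM sound set before relying on them. The names GM_PROGRAMS and program_change are just for illustration.

```python
# 1-based General MIDI Level 1 program numbers for three sounds.
GM_PROGRAMS = {"Accordion": 22, "Timpani": 48, "Piccolo": 73}
GM_DRUM_CHANNEL = 10  # GM reserves channel 10 for percussion

def program_change(channel: int, gm_program: int) -> bytes:
    """Build a 2-byte Program Change message from 1-based channel and program."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1-16")
    if not 1 <= gm_program <= 128:
        raise ValueError("GM program must be 1-128")
    # On the wire both values are zero-based: status is 0xC0 plus the channel.
    return bytes([0xC0 | (channel - 1), gm_program - 1])

msg = program_change(3, GM_PROGRAMS["Timpani"])
print(msg.hex(" "))  # c2 2f
```

Any GM Level 1 device receiving those two bytes on channel 3 should switch that channel to its timpani sound, whatever that sound happens to be like on that device.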