Vladimir Georgiev (The Slavonic and East European Review 44, no. 103, 1966, pp. 285-297)
The formation of a people is the result of continuous and extremely complicated processes. In determining the ethnogenesis of the population of a region, the first question to be answered would seem to be: what is the origin of the tribes or peoples that dwelt in the region concerned in antiquity — that is to say, what were those peoples that represent the substratum of the contemporary ones? The first task to be solved is, therefore, the problem of the substratum, i.e. the problem of the protohistory of the region under examination.
The problem of the genesis of a people may be examined linguistically, historically, archaeologically and ethnologically. As a linguist, I shall try to put forward some linguistic considerations and data towards a solution of this problem in relation to the Balkan peoples.
For the periods for which there are no written documents archaeologists determine regions belonging to the same culture by means of the identification of excavated objects. Linguists use a similar method. The linguist's material consists of toponyms, especially those which present a fairly wide frequency. By specifying a region where a characteristic toponym often appears we are able to delimit a linguistic or ethnic unity. Thus on the basis of the very frequent place names of the type Brighton, Frinton, Honiton, Leamington, Luton, Northampton, Royston, Southampton, Taunton, etc. formed with -ton (= town) an English-speaking region can easily be determined. In the same way, on the basis of frequent toponyms of the type Neustadt, Bernstadt, Heiligenstadt, etc. it is possible to define a German-speaking region (German Stadt 'town', 'city'), and on the basis of the type Belgrad, Stargard, Vyšegrad, etc. a Slavonic-speaking region (Slavonic gardə, grad 'town', 'city').
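The toponymic method described above can be illustrated with a minimal sketch. It assumes that a place name may be assigned to a language simply by its final element, which is a deliberate simplification: real work relies on attested ancient forms and on etymology, and naive matching of endings will produce false positives. The mapping `ELEMENTS` and the helper names `classify` and `tally` are hypothetical and serve only this illustration.

```python
from collections import Counter

# Illustrative mapping of final place-name elements to the language they suggest.
# The elements are those mentioned in the text; the mapping itself is an assumption.
ELEMENTS = {
    "ton": "English",
    "stadt": "German",
    "grad": "Slavonic",
    "gard": "Slavonic",
}


def classify(name):
    """Return the language suggested by the final element of a name, or None."""
    lowered = name.lower()
    for element, language in ELEMENTS.items():
        if lowered.endswith(element):
            return language
    return None


def tally(names):
    """Count how many names point to each language, ignoring unclassified names."""
    return Counter(lang for lang in map(classify, names) if lang is not None)


names = ["Brighton", "Luton", "Taunton", "Neustadt", "Bernstadt",
         "Belgrad", "Stargard", "Vyšegrad"]
counts = tally(names)
```

Plotted on a map, tallies of this kind for each district would outline the area in which one type of name is frequent, which is the sense in which a linguistic region is "delimited" here.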
By means of this method, and also on the basis of various other considerations, linguistics in the last twenty years has achieved very important findings about the ancient population of the Balkan Peninsula.¹
We can begin with the north-eastern part of the Balkan Peninsula.
I. The Daco-Mysian Region
There are about fifty toponyms formed as two-stem compound words with the term dava 'town, city', e.g. Acidava, Cumidava, Rusidava, Sucidava, etc. The number of toponyms of this type is considerable, which enables us to draw some conclusions on the basis of their geographical distribution. Their distribution is as follows:
Only one is to be found in Thrace, namely Pulpudeva, but this town is said to have been founded by Philip II, king of the Macedonians, who gave it his own name Pulpu-deva (= Philippo-polis 'the city of Philip'). Hence the name Pulpudeva is not autochthonous in Thrace, but imported from the west where other such toponyms existed.
Place names of this type are therefore characteristic only in Dacia and Mysia and are absent in Thrace.
II. The Thracian Region
In Thrace there are about fifty toponyms formed as two-stem compound words with the term para (probably meaning 'river' or 'brook'), e.g. Bessapara, Sauzupara, etc.; fourteen toponyms formed as two-stem compound words with the term bria 'town, city'; and eleven toponyms formed as two-stem compound words with the term diza 'fortress', e.g. Beodizos, Orudiza, etc. These three types of toponym occur only in Thrace; they do not appear in Dacia, in Mysia or in the western part of the Balkan Peninsula. Besides these there are other toponyms and personal names that appear only in Thrace or only in Dacia.
From this characteristic geographical distribution of the most frequent toponyms in the eastern part of the Balkan Peninsula (see Fig. 1) an important conclusion emerges. If in Thrace and Dacia the same toponymy was not used, then these two countries must have been inhabited in antiquity by peoples who spoke two different languages, i.e. two different ethnic unities dwelt there. Therefore the Daco-Mysian language was different from the Thracian one. This conclusion is certain, since it is not founded on etymologies that might be of a subjective character, but on geographical distribution which is an objective criterion.
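The distributional argument can be sketched as a small check of whether a toponymic element is confined to one region. The sample below uses only names quoted in the text; for simplicity it treats Dacia and Mysia as a single "Daco-Mysian" zone, and the `imported` flag (a hypothetical field) marks Pulpudeva, which the text regards as not autochthonous in Thrace. The record layout and function names are assumptions made for this illustration, not the author's procedure.

```python
from collections import defaultdict

# (toponym, region, imported) records. Region assignments follow the text;
# Dacia and Mysia are merged into one zone as a simplifying assumption.
RECORDS = [
    ("Acidava", "Daco-Mysian", False),
    ("Cumidava", "Daco-Mysian", False),
    ("Rusidava", "Daco-Mysian", False),
    ("Sucidava", "Daco-Mysian", False),
    ("Pulpudeva", "Thracian", True),  # said to be named by Philip II
    ("Bessapara", "Thracian", False),
    ("Sauzupara", "Thracian", False),
    ("Beodizos", "Thracian", False),
    ("Orudiza", "Thracian", False),
]

# Second elements discussed in the text, with the spelling variants that occur.
ELEMENTS = {
    "dava": ("dava", "deva", "dova"),
    "para": ("para",),
    "diza": ("diza", "dizos"),
}


def element_of(name):
    """Return the compound element a name ends in, or None if none matches."""
    lowered = name.lower()
    for element, variants in ELEMENTS.items():
        if lowered.endswith(variants):
            return element
    return None


def regions_by_element(records, include_imported=False):
    """Map each element to the set of regions in which it is attested."""
    result = defaultdict(set)
    for name, region, imported in records:
        element = element_of(name)
        if element is None or (imported and not include_imported):
            continue
        result[element].add(region)
    return dict(result)


distribution = regions_by_element(RECORDS)
# Elements attested in exactly one region are the diagnostic ones.
exclusive = {e: next(iter(r)) for e, r in distribution.items() if len(r) == 1}
```

With the imported name set aside, every element in the sample is confined to one zone; calling `regions_by_element(RECORDS, include_imported=True)` shows how a single transplanted name such as Pulpudeva would otherwise blur the boundary.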
Thus we have separated Dacian (or Daco-Mysian) from Thracian as two
different IE languages, and all other data and considerations support this
conclusion. The study of the phonology of these languages, for example,
proves that they are very different from each other.
III. The Pre-Greek Region
In south and middle Greece a characteristic type of place name or river and mountain name is formed by the suffixes […] and […], e.g. […], etc.; […], etc. Thus […] occurs here as the name of nine different towns, and […] is the name of nine rivers, etc. These toponyms, hydronyms and oronyms cannot be explained on the basis of Greek; they are not of Greek origin. They belong to a pre-Greek population.
In the same region other such names of non-Greek origin are found, e.g. […], a name of five rivers; […], a name of five rivers; […], a name of four rivers, etc.
The northern boundary of the pre-Greek population followed approximately the line formed by the rivers Acheloos-Pamisos-Peneios or the mountain Pindos, since north of this boundary these typical pre-Greek toponyms are absent.
IV. The Proto-Greek Region
The Greeks gradually occupied the Aegean territory from the end of the third or the beginning of the second millennium b.c. But it is still not clear where they dwelt before this invasion, or, in other words, where the proto-Greek region lay.
Study of toponyms shows clearly that this region lay approximately in
north-western Greece. The proofs are as follows:
V. The Macedonian Region
The original region of the ancient Macedonians was the basin of the river Haliakmon. The oldest toponyms here are very similar to the Greek ones. Numerous isoglosses connect the Macedonian language with different Greek dialects. This fact attests the genetic identity of Macedonian and Greek.
However, there is an essential difference between Macedonian and all other Greek dialects. This is the change of the IE mediae aspiratae (MA: bh, dh, gh) into tenues aspiratae (TA: ph, th, kh) in Greek, which was completed before the epoch of the Mycenaean documents. In Macedonian the IE MA changed into mediae (M: b, d, g). This difference, which separates Macedonian from all other Greek dialects, is therefore very old. There are also other differences.
In the present writer's opinion, ancient Macedonian is closely related to Greek, and Macedonian and Greek are descended from a common Greek-Macedonian idiom that was spoken till about the second half of the third millennium b.c.
VI. The Proto-Phrygian Region
Ancient authors inform us that the Phrygians dwelt formerly in Macedonia and eastern Illyria. The original region, i.e. the primitive home of the Phrygians, was probably the basin of the river Erigon, today Černa (or Crna) in northern Macedonia.
After the recent studies of the Phrygian inscriptions of Asia Minor by O. Haas and R. Gusmani,² it is clear that Phrygian was closely akin to Greek. In the present writer's opinion, Greek, Macedonian and Phrygian formed a common language in the fourth millennium b.c. But when the Phrygians, in about the second half of the second millennium b.c., passed gradually over southern Thrace into the north-western part of Asia Minor, their language was influenced by Thracian and Mysian.
VII. The Illyrian Region
A theory dominant for a long time was that the entire western part of the Balkan Peninsula was inhabited by Illyrians. But the ethnic situation in this part of the Peninsula is not so clear as in its other parts. Here toponyms of high frequency are not to be found. […] (in […]) appears only four times, […] also four times, and Ulc- (in Ulcisia, Ulcinium) three times.
After the recent studies of H. Kronasser, R. Katičić, Rendić-Miočević and G. Alföldy,³ it can be regarded as very probable that Illyrian was spoken only in Illyria and some neighbouring regions. In middle Dalmatia another language was spoken, in Liburnia also another one (Liburnian), and the Venetic language was close to Latin, not to Illyrian. In the opinion of the present writer, Illyrian is an IE language intermediate between Venetic and Phrygian. This question still remains open.
Daco-Mysians, Thracians, Greeks, pre-Greeks, Macedonians, Phrygians and Illyrians formed, therefore, the main substratum that underlies the Balkan peoples of today. In the first half of the first millennium b.c. the Greek colonisation began, embracing the eastern and south-eastern shores of the Peninsula. In the first millennium a.d., ancient Thrace gradually became strongly Hellenised.
Towards the end of the first millennium b.c., the Roman conquest of the Balkan Peninsula began, gradually resulting in a partial Romanisation of the northern and north-western zones of the Peninsula. The so-called Jireček line that leads from northern Albania (Lesh) to Serdica (today Sofia) and north of the Balkan mountains as far as the Black Sea separates the two zones of Roman and Greek influence respectively. In the north-eastern part of the Balkan Peninsula, especially in the area of what is now Rumania, invasions of certain Iranian tribes occurred at different times from the 7th century b.c. After the 3rd century a.d. continuous invasions of Goths began, followed by various Turkic tribes as well as Slavs. Between the 6th and the 9th centuries a.d., Slavs occupied large parts of the northern central zone of the Peninsula, and penetrated also into some regions of its southern zone.
After the 14th century, the Peninsula was invaded by the Turks.
The mediaeval and modern history of the Balkan peoples is better known.
Hence I would like to touch briefly only upon two very much disputed problems,
namely, the origin of the Albanians and the Rumanians, and the invasion
of the Slavs.
VIII. Albanians and Rumanians
Whether the Albanians are the successors of the Illyrians or the Thracians is a problem that has long been debated. Today the Albanians dwell in a region that was known in antiquity as Illyria. For that reason the Albanians have often been regarded as the heirs of the ancient Illyrians, although there are no other data supporting such a claim. In the same way, the Bulgarians might be considered as Thracians if the other Slavonic peoples and languages were not known.
But many linguists and historians, e.g. H. Hirt, V. Pârvan, Th. Capidan, A. Philippide, N. Jokl, G. Weigand, P. Skok, D. Detschew, H. Barić, I. Siadbei, etc., have put forward very important considerations indicating that the Albanians cannot be autochthonous in the Albania of today, and that their original home was the eastern part of Mysia Superior or approximately Dardania and Dacia Mediterranea, i.e. the northern central zone of the Balkan Peninsula, and part of Dacia.
Now, however, when it is clear that Daco-Mysian and Thracian represent two different IE languages, the problem of the origin of the Albanian language and the Albanians themselves appears in quite a new light. The most important facts and considerations for determining the origin and original home of the Albanians are the following.
1. The Illyrian toponyms known from antiquity, e.g. Shkodër from the ancient Scodra (Livy), Tomor from Tomarus (Strabo, Pliny, etc.), have not been directly inherited in Albanian: the contemporary forms of these names do not correspond to the phonetic laws of Albanian. The same also applies to the ancient toponyms of Latin origin in this region.
2. The most ancient loanwords from Latin in Albanian have the phonetic form of eastern Balkan Latin, i.e. of proto-Rumanian, and not of western Balkan Latin, i.e. of old Dalmatian Latin. Albanian, therefore, did not take its borrowings from Vulgar Latin as spoken in Illyria.
3. The Adriatic coast was not part of the primitive home of the Albanians, because the maritime terminology of Albanian is not their own, but is borrowed from different languages.
4. Another indication against local Albanian origin is the insignificant number of ancient Greek loanwords in Albanian. If the primitive home of the Albanians had been Albania itself, then the Albanian language would have to have many more ancient Greek loanwords.
5. The Albanians are not mentioned before the 9th century a.d., although place names and personal names from the whole region of Albania are attested in numerous documents from the 4th century onwards.
6. The old home of the Albanians must have been near to that of the proto-Rumanians. The oldest Latin elements in Albanian come from proto-Rumanian, i.e. eastern Balkan Latin, and not from Dalmatian, the western Balkan Latin that was spoken in Illyria. Cf. the phonetic development of the following words:
| Vulgar Latin caballum 'horse' | Rum. cal, Alb. kal |
| Vulgar Latin cubitum 'elbow' | Rum. cot, Alb. kut |
| Vulgar Latin lucta 'struggle, fight' | Rum. luptǎ, Arum. luftǎ, Alb. luftë |
Therefore Albanian did not take shape in Illyria. The agreement in the treatment of Latin words in Rumanian and in Albanian shows that Albanian developed from the 4th till the 6th century in a region where proto-Rumanian was formed.
7. Rumanian possesses about a hundred words which have their correspondences only in Albanian. The form of these Rumanian words is so peculiar (e.g. Rum. mazǎre = Alb. modhullë 'pea(s)') that they cannot be explained as borrowings from Albanian. This is the Dacian substratum in Rumanian, whereas the Albanian correspondences are inherited from Dacian.
The above arguments are well known, but they have not been regarded as sufficient for a definitive solution of the problem. The most important fact to be revealed has been the separation of Daco-Mysian from Thracian. It has thus been established that the phonemic system of Albanian is descended directly from the Daco-Mysian.
Let us consider some examples. The most typical features of the historical
phonology of Albanian are attested in Daco-Mysian. Besides, in Daco-Mysian
there also appear the intermediate phonetic changes that explain the peculiar
phonetic development of Albanian. Here are some samples:
| IE | Daco-Mysian | Albanian |
| ē | (ē) > ā > o | o |
| ā | ā > o | o |
| ō | ō > ö > e | e |
| ū | ū > ü | y, i |
IE e > D.-M. ie:
a Dacian tribe is named […], but a Thracian one […].
Dacian PN Diegis from IE *dhegʷh-.
Dacian river name […] from IE *erðs-.
Dacian word dielina 'henbane' from IE *dhel-.
IE ē > D.-M. ē > ā > o:
IE *dhēwā > D.-M. dēva > dāva > dova, cf. Pulpudeva (4th century b.c.), Buridava (1st century a.d.), Pelendova (after the 4th century a.d.).
IE ō […] > oi > ö > e:
Salmor-ude 'Salt Water', a salt lake in Scythia Minor, in Greek called […] 'Salt (Lake)' and in Latin palus Salameir; Dacian ude from IE *udo(r) 'water'.
[…] (2nd century a.d.) > Pelendova (after the 4th century a.d.) from *pōl-ōm *dhēwā 'Stuttgart', cf. Alb. pelë 'mare'.
IE ū […] > oi (= ü) > ü (i):
[…], Moesi, Mysi.
In this way it has been definitively proved that Albanian is descended from Daco-Mysian. Therefore the primitive home of Albanian is a Daco-Mysian region, probably Mysia Superior (Dardania, Dacia Mediterranea) or western Dacia. This fact enables us to explain the numerous typical agreements between Albanian and Rumanian.
Rumanian and Albanian took shape in the Daco-Mysian region; Rumanian represents a completely Romanised Daco-Mysian and Albanian a semi-Romanised Daco-Mysian.⁴
IX. The Slav Invasion
The last problem that I would briefly touch upon is the invasion of the Balkan Peninsula by the southern Slavs.
Byzantine authors inform us that from the 6th and 7th centuries a.d. onwards Slavs occupied large parts of the Balkan Peninsula, but they do not tell us how this invasion was carried out.
In Procopius' book De aedificiis, written about the middle of the 6th century, many toponyms from the Balkan Peninsula appear for the first time. In the 19th century some historians, e.g. M. Drinov and L. Niederle, attempted to explain several of them as Slavonic. But as Slavonic onomastics was then in its infancy, this attempt was unconvincing. Today, after serious and thorough study in the field of Slavonic onomastics, Slavonic toponyms can be determined with certainty in the above-mentioned work of Procopius.⁵
The principles of interpretation are the following. A toponym in Procopius
is Slavonic when it finds genetically identical correspondences in south
Slavonic toponymy and especially in the Slavonic toponyms of Greece. The
comparison with the Slavonic toponyms of Greece is very important, since
the latter appear in the same Greek alphabet. Here are some examples:
These Slavonic names from the first half of the 6th century give us an idea of south Slavonic three centuries before the oldest written Slavonic documents. Hence they are very important for Slavonic philology. They contain some archaic features, such as the lack of the third palatalisation, etc. (e.g. […] = CS *Gərbьkə > gərbьcь > Bulg. gǎrbec).
The geographical distribution of these toponyms (see Fig. 2) gives us a picture of the oldest infiltration of the Slavs into the Balkan Peninsula. These toponyms are to be found mainly in the region of the rivers Timok-Morava and in the territory of Niš-Sofia. They appear also, but not so frequently, in north-eastern Bulgaria, including the Dobrudža. This means that these two regions constituted the main gates of the Slavonic infiltration into the Peninsula. This is quite natural, because the two regions concerned were difficult to guard even for a well-organised Byzantine army; the former is very mountainous and the latter was densely wooded in antiquity.
These results are confirmed by the study of hydronyms, which are very resistant to change, especially names of large rivers; they often survive repeated changes of population. Statistics of the names of the larger rivers in Bulgaria show the following pattern:⁶
| Origin | Large rivers | Medium-sized rivers |
| Thracian (pre-Slavonic) | 70 per cent | 15 per cent |
| Bulgarian | 7 per cent | 56 per cent |
The geographical distribution of the Slavonic hydronyms in Bulgaria (see Fig. 3) is as follows. Slavonic names of rivers are frequent in western and north-western Bulgaria, but they are almost missing in eastern and especially in south-eastern Bulgaria. This means that the central zone of the Peninsula was first Slavonicised, while the east followed much later. The statistical patterns of the hydronyms agree, therefore, with the conclusion drawn from the Slavonic toponyms of Procopius.
1. See V. Georgiev, La toponymie ancienne de la péninsule Balkanique et la thèse méditerranéenne, Sofia, 1961; V. Georgiev, Introduzione alla storia delle lingue indoeuropee, Rome, 1966, pp. 44 ff., 120 ff. and 175 ff.
2. O. Haas, 'Die sprachgeschichtliche Stellung des Phrygischen' in Ezikovedski izsledvanija v čest na S. Mladenov, Sofia, 1957, pp. 451 ff.; id., 'Die phrygische Sprache im Lichte der Glossen und Namen' (Linguistique Balkanique, II, Sofia, 1960, pp. 25 ff.); id., 'Armenier und Phryger' (Linguistique Balkanique, III, 2, 1961, pp. 29 ff.); R. Gusmani, Studi frigi, Milan, 1959, pp. 44 ff. and 855 ff.; V. Georgiev, Introduzione alla storia delle lingue indeuropee, pp. 149 ff.
3. H. Kronasser, 'Zum Stand der Illyristik' (Linguistique Balkanique, IV, 1962, pp. 5 ff.); R. Katičić, 'Namengebiete im römischen Dalmatien' (Die Sprache, X, Vienna, 1964, pp. 23 ff.); id., 'Illyrii proprie dicti' (Živa Antika, Skopje, XIII/XIV, 1964, pp. 87 ff.); id., 'Suvremena istraživanja o jeziku starosjedilaca ilirskih provincija' (Naučno društvo SR Bosne i Hercegovine, IV, Sarajevo, 1964, pp. 9 ff.); G. Alföldy, 'Die Namengebung der Urbevölkerung der römischen Provinz Dalmatia' (Beiträge zur Namenforschung, 15, Heidelberg, 1964, pp. 54 ff.).
4. See V. Georgiev, 'Albanisch, Dakisch-Mysisch und Rumänisch. Die Herkunft der Albaner' (Linguistique Balkanique, II, 1960, pp. 1 ff. and pp. 15 ff.); A. Vraciu, 'Rassuždenija o dakomizijskom substrate rumynskogo jazyka' (Linguistique Balkanique, VIII, 1964, pp. 15 ff.); V. Georgiev, 'Le dace comme substrat de la langue roumaine' (Revue roumaine de linguistique, X, Bucarest, 1965, pp. 75 ff.).
5. See V. Georgiev, 'Naj-starite slavjanski imena na Balkanskija poluostrov i tjahnoto značenie za našija ezik i našata istorija' (Bǎlgarski ezik, VIII, Sofia, 1958, pp. 321 ff.).
6. See V. Georgiev, Bǎlgarska etimologija i onomastika, Sofia, 1960, pp. 21 ff.