Our project involves two different research questions, and as such, the methodologies used to answer them are quite diverse.

If you are interested in knowing how we have coded the poets' biographies, check out our mapping section. You can also find there the technical details that explain how we displayed our data using Google Maps and Google Earth.

For more information about the use of social network analysis methods to study the relationships between the poets, visit the social networking section.

Finally, you can find all the information regarding the linguistic analysis here.

Mapping Medieval Galician-Portuguese Poets

Between the twelfth and fourteenth centuries, a poetic movement in the Galician-Portuguese language spread throughout the kingdoms of the Iberian Peninsula (with the exception of Catalonia, where Provençal was used). The medieval Galician-Portuguese Cantigas (songs) remain one of the richest parts of the Iberian Peninsula's medieval heritage.

One thousand six hundred seventy-nine songs written by 187 troubadours and minstrels have come down to us in the form of the Cancioneiros (songbooks). In this project, we analyze some of these troubadours through time and space in order to better understand how this poetic school developed from a socio-historical point of view.

The personography

The KML files

Embedding the maps in our website

Other criteria

Social Network Analysis

To find out which cultural centers were the most relevant in the Iberian kingdoms during the Middle Ages, we graphed relational patterns in a social network that you can find here.

As we explained in the mapping section, one of the problems of building historical personographies is data accuracy. In our sources, the information we care about is very often presented with some degree of uncertainty. Moreover, we are studying a period of upheaval, defined by conflicts, changes and reorganizations. An apparently simple location reference, such as a particular king's court, becomes a very difficult one when we try to convert it into coordinates. Kings and noblemen move, and their courts and political influence move with them. The references are often unclear, and although we have tried to be as rigorous as possible, we have had to make some concessions to inaccuracy in favor of functionality.

However, with social network methods you can be much more accurate in some ways when you disregard geographical data and definite time references and keep only the social relationships.


  1. To build the social network we used Cytoscape, an open source software platform for visualizing interaction networks. Cytoscape was originally designed for biological research, but it is now a general platform that can be used to visualize and analyze any kind of network graph made up of nodes and edges (e.g. social networks).
  2. We transformed our personography into .csv files using XSLT.
  3. Once our data was in tables, we imported them into Cytoscape and created the initial network.
  4. We defined the columns containing the troubadours and their patrons as nodes. The period was chosen as the interaction value, and then we selected the nationality as an edge attribute. Afterwards, we uploaded the table containing the type of patron (either royal, noble or unknown) as a node attribute.
  5. Cytoscape automatically analyses the data and gives you back an exhaustive report. This means that you can customize your graph according to centrality measures, that is, the relative importance of nodes and edges in the network (the elements that keep it together, in other words).
  6. We customized the graphic elements of the network using Cytoscape's built-in tools. We mapped our data to node color, label, edge thickness, edge color, etc. according to the following configurations:
    • Node size and label size depend on degree centrality. The bigger the node, the more ties it has to other nodes.
    • The width of an edge is related to its edge betweenness, that is, the number of shortest paths between pairs of nodes that run along it. The thicker the stroke, the greater its influence on the transfer of relations through the network.
    • The color of a node defines its type. We distinguish poets from the “venues” where they carried out their cultural activity. Within this last category, we indicate whether the venue is a royal court, a lord's patronage, or an actual geographic place.
    • The type of stroke makes it possible to visualize the origin of the poets so, in a way, we still preserve part of our geographical data.
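
As a rough illustration of the degree measure behind the node sizes above, the sketch below counts ties per node from a small edge table. It is not the Cytoscape workflow itself: the column names (`poet`, `patron`, `period`), the placeholder rows and the helper `degree_by_node` are all assumptions made for the example, and it reports raw tie counts rather than Cytoscape's own statistics.

```python
import csv
import io
from collections import Counter

# Hypothetical edge table standing in for the CSV exported from the personography.
# Column names and values are placeholders, not project data.
EDGE_TABLE = """poet,patron,period
Poet A,Court X,1
Poet B,Court X,1
Poet B,Court Y,2
Poet C,Court Y,2
"""


def degree_by_node(edge_csv):
    """Count how many ties each node (poet or patron) takes part in."""
    degree = Counter()
    for row in csv.DictReader(io.StringIO(edge_csv)):
        degree[row["poet"]] += 1
        degree[row["patron"]] += 1
    return degree


degrees = degree_by_node(EDGE_TABLE)
for node, ties in degrees.most_common():
    print(node, ties)
```

Each row is treated as one tie, so a poet listed twice with the same patron would be counted twice; whether that is desirable depends on how the real table is built.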

Linguistic Analysis

How did the Portuguese pronoun system change throughout the Late Middle Ages?

The Language

Galician-Portuguese was the language spoken in the western part of the Iberian Peninsula until the mid-fourteenth century. Derived from Latin, it emerged progressively as a distinct language before the ninth century in the territories of the old Kingdom of Galicia, which covered the regions of modern-day Galicia and northern Portugal. In this sense, we can say that the term Galician-Portuguese designates not only a language, but also a phase in its development, which would later lead to the differentiation between present-day Galician and Portuguese due to political and sociolinguistic circumstances.

Some History

Portuguese, like the other Romance languages, came from Latin, or more specifically, from an altered form of Latin called Vulgar Latin. The transition from Vulgar Latin to the Romance languages was due to many factors, notably the relative geographic isolation of the different groups of speakers and subsequent linguistic superimposition from substrate languages. An example of such superimposition is the intensified stress accent imposed upon the Vulgar Latin of the Iberian Peninsula by invading Germanic peoples. The Vulgar Latin spoken along the western coast of the Iberian Peninsula, which had less of the Germanic stress accent than anywhere else, grew into Portuguese. The exact date of this transformation is unclear, but the earliest documents in Portuguese appeared at the end of the twelfth century and mark the historical beginning of Old Portuguese.

For four centuries, the language underwent many changes, one of the most important being a marked intensification of the stress accent, which was reflected in an increase in syncope found in verse and in a tendency toward greater word individuation. Toward the end of the sixteenth century, nearly all the distinctive characteristics of Old Portuguese had disappeared, leaving the rough equivalent of the modern Portuguese spoken today.


For the diachronic variation study, we chose a sample of authors from five time periods used to divide our timeline. Using the work of Williams (1962), we marked up pronouns and categorized them, using Williams's standards, as old or new. Then, we determined how accurate the claims made by Williams are by looking at the distribution of the old and new pronouns across the five time periods.

Williams's claim is that Old Portuguese appeared at the end of the 12th century and had completed its transformation into modern Portuguese by approximately the end of the 16th century. Our goal in looking at the pronouns used in the poetry of medieval Galician troubadours is to see whether Williams's claims are reflected in the distribution of pronouns across the poems.

The Galician-Portuguese poetic movement remains a rich cultural event in the history of the Iberian Peninsula. The troubadours and minstrels who authored the songs were greatly revered, as were their works. Therefore, the Portuguese used in their poems could be seen as representative of the language being used at the time. If anything, it is one of the most consistent collections of texts because it spans roughly three centuries during the evolution of the language into modern Portuguese. By tracking the use of archaic and modern pronouns in the poems, we can begin to understand the timeline of how the Portuguese language changed over time.

This, of course, only scratches the surface of language change. In this project, we look only at personal pronouns because they underwent quantifiable changes that could be easily recognized and marked. A more in-depth study might include other linguistic features, such as phonological features reflected in the orthography. We hope that this project might open the door to more research in this area.

The Markup

After formatting the poems in an XML document, the next step was to mark all the personal pronouns that Williams mentioned in his study. Below are charts taken from Williams's research.

Archaic                         Modern
mego, comego, migo              comigo
tego, contego, tigo             contigo
sego, consego, sigo, consigo    consigo
mim                             mi
ele                             el
eis                             eles
nosco                           connosco
vosco                           convosco
li                              lhe
lis, lhis                       lhes
che                             te
xe                              se

In the texts, we used regex to mark all personal pronouns, modern and archaic, with the following tag:

<w ana="#pronoun">pronoun</w>

Importantly, we did not want to miss any pronouns that were hyphenated or had been assimilated into surrounding words. Therefore, we used the expression:

Find: ([^a-z])(mi[m]?)([^a-z])
Replace with: \1<w ana="#pronounCode">\2</w>\3

The pronoun code refers to the feature library document, which is available to view on our GitHub. This allows us to find and count the pronouns according to their age classification, which is marked in the feature library. Using these labels, we can calculate the percentage of modern and archaic forms and identify their distribution across the time periods of the poets.
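
For readers who want to try the find-and-replace expression outside an XML editor, here is a minimal sketch of the same substitution with Python's `re` module. The function name `tag_pronoun` and the sample phrases are assumptions made for illustration; the pattern and the replacement template are the ones quoted above.

```python
import re

# Same pattern as the Find expression above: a non-letter, then "mi" or "mim",
# then another non-letter. The two boundary characters are captured so that
# they can be written back unchanged around the new <w> element.
PRONOUN = re.compile(r"([^a-z])(mi[m]?)([^a-z])")


def tag_pronoun(text, code="pronounCode"):
    """Wrap each standalone 'mi' or 'mim' in a <w> element (hypothetical helper)."""
    return PRONOUN.sub(rf'\1<w ana="#{code}">\2</w>\3', text)


print(tag_pronoun(" a mi senhor "))
print(tag_pronoun(" disse-mi "))
print(tag_pronoun(" meu amigo "))
```

Because the hyphen counts as a non-letter, a hyphenated form such as "disse-mi" is tagged, while "amigo" is left alone. Two caveats of the pattern as written: it only sees lowercase forms, and since each match consumes its boundary characters, two pronouns separated by a single space would need a second pass.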

Using eXide

First, we counted how many of the pronouns we marked were archaic and how many were modern. We used XQuery to do so:

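declare namespace tei="http://www.tei-c.org/ns/1.0";
(: Song texts and the feature library that classifies each pronoun code :)
let $docTexts := doc('/course/iberian/data/songTextsThirdTry.xml')/tei:TEI
let $docFeat := doc('/course/iberian/aux/features.xml')/tei:TEI
(: Keep every w element whose @ana, minus the leading '#', names a feature structure labelled #archaic :)
let $archaic := $docTexts//tei:w[substring(@ana,2) =
    $docFeat//tei:fs[tokenize(@feats/string(),'\s+') = '#archaic']/@xml:id]
(: Wrap $archaic in count() to get the total instead of the list of forms :)
return $archaic/string()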

Then we used another query to filter the results by period:

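(: Assumes $docTexts and $docFeat are declared as in the previous query :)
(: Restrict the search to songs whose author affiliation points to period 1 :)
let $period := $docTexts//tei:div[descendant::tei:affiliation/@corresp='1']
let $archaic := $period//tei:w[substring(@ana,2) =
    $docFeat//tei:fs[tokenize(@feats/string(),'\s+') = '#archaic']/@xml:id]
return $archaic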
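
The percentage step mentioned earlier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two inline XML fragments are invented stand-ins for the feature library and the song texts, the `#modern` label is assumed by analogy with `#archaic`, and the helpers `age_by_id` and `age_shares` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical stand-in for the feature library (ids and labels are assumptions).
FEATURES = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <fs xml:id="li" feats="#archaic"/>
  <fs xml:id="lhe" feats="#modern"/>
</TEI>"""

# Hypothetical stand-in for the marked-up song texts (not real verses).
TEXTS = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <div><l><w ana="#li">li</w> e <w ana="#li">li</w></l></div>
  <div><l><w ana="#li">li</w> e <w ana="#lhe">lhe</w></l></div>
</TEI>"""


def age_by_id(features_xml):
    """Map each feature-structure id to 'archaic' or 'modern'."""
    ages = {}
    for fs in ET.fromstring(features_xml).iter(TEI + "fs"):
        feats = fs.get("feats", "").split()
        for age in ("archaic", "modern"):
            if "#" + age in feats:
                ages[fs.get(XML_ID)] = age
    return ages


def age_shares(texts_xml, ages):
    """Return the percentage of tagged pronouns in each age class."""
    counts = Counter()
    for w in ET.fromstring(texts_xml).iter(TEI + "w"):
        age = ages.get(w.get("ana", "")[1:])  # drop the leading '#'
        if age:
            counts[age] += 1
    total = sum(counts.values())
    return {age: 100 * n / total for age, n in counts.items()} if total else {}


shares = age_shares(TEXTS, age_by_id(FEATURES))
print(shares)
```

Running the same function on one period's songs at a time would give the distribution across periods that the study compares with Williams's timeline.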