Research - Medieval Galician-Portuguese Poetry

Mapping Medieval Galician-Portuguese Poets

Between the twelfth and fourteenth centuries, a poetic movement in the Galician-Portuguese language flourished throughout the kingdoms of the Iberian Peninsula (with the exception of Catalonia, where Provençal was used). The medieval Galician-Portuguese cantigas (songs) remain one of the richest parts of the Iberian Peninsula's medieval heritage.

A total of 1,679 songs written by 187 troubadours and minstrels have come down to us in the form of the Cancioneiros (songbooks). In this project, we analyze some of these troubadours through time and space in order to gain a better understanding of how this poetic school developed from a socio-historical point of view.

The personography

Using the Base de datos da Lírica Profana Galego-Portuguesa (MedDB) as our main source, we selected fifty-two troubadours according to diachronic criteria. We chose every poet from the first, fourth and fifth periods: because so few poets belong to these periods, we had to include them all in order to work with a significant corpus. From the second period, we left out the seven poets whose biographies were the least certain. For the third period, which represents 75% of the corpus, we applied further criteria besides certainty: origin (we selected every poet who was not Galician or Portuguese, so that other nationalities were represented) and the number of songs preserved (as a way to identify the most important authors).
We used TEI Roma to customize our XML schema (read more at the Text Encoding Initiative site).
For each poet, we marked up his name, nationality, birthplace, social status, occupation, and affiliation. We looked this information up in two databases: the MedDB and the Cantigas Medievais Galego Portuguesas. We also used Wikipedia to get additional biographical information (especially about kings and counts).
We added a @cert (certainty) attribute to every <nationality> element, with values ranging from low and medium to high. When our sources questioned the certainty of other pieces of information, we encoded them with the same attribute, using values that ranged from unknown to low and medium.
We used Google Maps and GeoHack to find the coordinates of every geographical reference in the personography, and tagged them inside a <geo> element. Since we did not add a <geoDecl> element to the TEI header to define a coordinate system, we followed the TEI default for such cases: the content of each <geo> element is a pair of numbers, to be interpreted as latitude followed by longitude according to the World Geodetic System. (The TEI default separates the two numbers with whitespace; in our files they are separated by a comma, as in the sample below.)

To sum up, this is a sample of the coding of one of the poets:

<person xml:id="JGarGlh">
    <persName>Johan Garcia de Guilhade</persName>
    <nationality key="pt" cert="high">Portuguese</nationality>
    <birth>
        <location>
            <placeName>Milhazes, Barcelos</placeName>
            <geo>41.490833,-8.660556</geo>
        </location>
    </birth>
    <socecStatus key="kn">Knight</socecStatus>
    <occupation key="tr">Troubadour</occupation>
    <floruit from="1240" to="1265" period="3">
        <location corresp="#sousa" from="1240" to="1243"/>
        <location corresp="#jaen" from="1244" to="1248"/>
        <location corresp="#afonsX" from="1249" to="1253"/>
        <location corresp="#afonsIII" from="1253" to="1265"/>
    </floruit>
</person>

We marked up an ancillary file containing every affiliation, which explains the @corresp values inside floruit/location. For each affiliation, we specified whether it depended on royal or lordly patronage.
```
<place xml:id="sousa" type="noble">
    <location>
        <placeName type="short">The Sousas</placeName>
        <placeName type="full">Court of D. Gonçalo Garcia de Sousa</placeName>
        <geo>41.087778,-8.520278</geo>
    </location>
</place>
```
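
The way the @corresp pointers connect the two files can be illustrated outside XSLT. The following Python sketch is only an illustration under stated assumptions, not the project's pipeline: it works on trimmed copies of the two samples above, leaves out the TEI namespace, and wraps the <place> entry in a hypothetical <listPlace> element. The function name resolve_stays is likewise invented for the example.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # expanded name of xml:id

# Trimmed copies of the samples above. The TEI namespace is left out and
# the <listPlace> wrapper is an assumption made for this sketch.
PERSON_XML = """<person xml:id="JGarGlh">
    <persName>Johan Garcia de Guilhade</persName>
    <floruit from="1240" to="1265" period="3">
        <location corresp="#sousa" from="1240" to="1243"/>
    </floruit>
</person>"""

PLACES_XML = """<listPlace>
    <place xml:id="sousa" type="noble">
        <location>
            <placeName type="short">The Sousas</placeName>
            <geo>41.087778,-8.520278</geo>
        </location>
    </place>
</listPlace>"""


def resolve_stays(person, places):
    """Follow each floruit/location/@corresp pointer to its <place> entry."""
    index = {place.get(XML_ID): place for place in places.iter("place")}
    stays = []
    for loc in person.findall("floruit/location"):
        place = index[loc.get("corresp").lstrip("#")]
        lat, lon = place.findtext("location/geo").split(",")
        stays.append({
            "court": place.findtext("location/placeName[@type='short']"),
            "patronage": place.get("type"),
            "years": (int(loc.get("from")), int(loc.get("to"))),
            "lat_lon": (float(lat), float(lon)),
        })
    return stays


stays = resolve_stays(ET.fromstring(PERSON_XML), ET.fromstring(PLACES_XML))
print(stays)
```

Each resulting entry combines the dates stored with the poet and the name, patronage type and coordinates stored with the court.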

The KML files

As a first step, we used XSLT for an XML-to-XML transformation in which we converted the coordinates to the KML standard, that is, three numbers separated by commas: the longitude first, followed by the latitude and the altitude. Since the altitude was meaningless for our project, we added 100 as a default value.
```
<!-- Identity template: copy every node and attribute unchanged -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
<!-- Rewrite "latitude,longitude" as "longitude,latitude,100" (KML order plus default altitude) -->
<xsl:template match="geo">
    <geo>
        <xsl:value-of select="concat(substring-after(., ','), ',',
            substring-before(., ','), ',100')"/>
    </geo>
</xsl:template>
```
Then we used XSLT again to transform the newly created XML file into KML.
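
For readers less familiar with XSLT, the coordinate reordering of the first transformation can be sketched in Python. This is an illustrative equivalent, not the code we ran; the function name tei_geo_to_kml is invented for the example, and the sketch accepts either a comma or whitespace between the two numbers.

```python
def tei_geo_to_kml(geo_text, altitude=100):
    """Turn TEI 'lat,long' (or 'lat long') into KML 'long,lat,altitude'."""
    # Values are kept as strings so that no digits are lost to rounding.
    lat, lon = geo_text.replace(",", " ").split()
    return f"{lon},{lat},{altitude}"


print(tei_geo_to_kml("41.490833,-8.660556"))  # prints -8.660556,41.490833,100
```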

This transformation included two different <Placemark> elements for each poet: one for their birthplace and one for their journey. Inside the second one, the coordinates of every <location> in the personography were inserted as descendants of a <LineString> element so as to draw the paths. Thus, the mark-up for one poet looks like this:

<Placemark>
    <name>Garcia Mendiz d' Eixo</name>
    <description>Eixo, Garcia Mendiz d' Eixo's place of birth</description>
    <Style>
        <IconStyle>
            <Icon>
                <href>http://maps.google.com/mapfiles/kml/paddle/purple-stars.png</href>
            </Icon>
        </IconStyle>
    </Style>
    <Point>
        <coordinates>-8.570415,40.631709,100</coordinates>
    </Point>
</Placemark>
<Placemark>
    <description>Garcia Mendiz d' Eixo's journey</description>
    <styleUrl>#WhiteLine</styleUrl>
    <LineString>
        <tessellate>0</tessellate>
        <coordinates>
            -8.570415,40.631709,100
            -8.47,40.24,100
            -0.883533,41.650192,100 
            -8.520278,41.087778,100 
            -5.570956,42.600636,100
        </coordinates>
    </LineString>
</Placemark>
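
The shape of these two <Placemark> elements can also be reproduced with a few lines of Python. The sketch below is only an illustration of the structure shown above, not our XSLT: the function names are hypothetical and the <Style> block of the birthplace marker is left out for brevity.

```python
import xml.etree.ElementTree as ET


def birth_placemark(name, place, coord):
    """Build the point marker for a poet's birthplace (icon style omitted)."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    ET.SubElement(pm, "description").text = f"{place}, {name}'s place of birth"
    point = ET.SubElement(pm, "Point")
    ET.SubElement(point, "coordinates").text = coord
    return pm


def journey_placemark(name, coords):
    """Build the path marker; KML separates coordinate tuples by whitespace."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "description").text = f"{name}'s journey"
    ET.SubElement(pm, "styleUrl").text = "#WhiteLine"
    line = ET.SubElement(pm, "LineString")
    ET.SubElement(line, "tessellate").text = "0"
    ET.SubElement(line, "coordinates").text = " ".join(coords)
    return pm


coords = ["-8.570415,40.631709,100", "-8.47,40.24,100"]
for pm in (birth_placemark("Garcia Mendiz d' Eixo", "Eixo", coords[0]),
           journey_placemark("Garcia Mendiz d' Eixo", coords)):
    print(ET.tostring(pm, encoding="unicode"))
```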

We also established a <Placemark> element for each of the affiliations:

<Placemark>
    <name>The Trabas</name>
    <description>House of Traba Court (noble court)</description>
    <Style>
        <IconStyle>
            <Icon>
                <href>http://maps.google.com/mapfiles/kml/paddle/red-circle.png</href>
            </Icon>
        </IconStyle>
    </Style>
    <Point>
        <coordinates>-7.71765,42.063,100</coordinates>
    </Point>
</Placemark>

Since we had used @cert attributes to encode the information that our sources considered dubious, we referred to those values in our XSLT to create different <IconStyle> elements.

<xsl:template match="person">
    <!--Other template rules-->
    <IconStyle>
        <xsl:choose>
            <xsl:when test="birth[@cert='low']">
                <Icon>
                    <href>http://maps.google.com/mapfiles/kml/paddle/ltblu-blank.png</href>
                </Icon>
            </xsl:when>
            <xsl:when test="birth/location[@cert='unknown'] | birth/location[@cert='low']">
                <Icon>
                    <href>http://maps.google.com/mapfiles/kml/paddle/ltblu-blank.png</href>
                </Icon>
            </xsl:when>
            <xsl:otherwise>
                <Icon>
                    <href>http://maps.google.com/mapfiles/kml/paddle/purple-stars.png</href>
                </Icon>
            </xsl:otherwise>
        </xsl:choose>
    </IconStyle>
    <!--Other template rules-->
</xsl:template>
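
The decision made by the <xsl:choose> above can be restated as a small Python function. It is a sketch that mirrors the three branches of the template, not a replacement for it; the function name birth_icon and the two test records are invented for the example.

```python
import xml.etree.ElementTree as ET

ICON_BASE = "http://maps.google.com/mapfiles/kml/paddle/"


def birth_icon(person):
    """Pick the birthplace icon from the cert values on birth and birth/location."""
    birth = person.find("birth")
    location = person.find("birth/location")
    uncertain = (
        (birth is not None and birth.get("cert") == "low")
        or (location is not None and location.get("cert") in ("unknown", "low"))
    )
    # Light blue for doubtful birthplaces, purple stars otherwise.
    return ICON_BASE + ("ltblu-blank.png" if uncertain else "purple-stars.png")


doubtful = ET.fromstring('<person><birth><location cert="unknown"/></birth></person>')
secure = ET.fromstring("<person><birth><location/></birth></person>")
print(birth_icon(doubtful))
print(birth_icon(secure))
```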

We used the @period attribute of our personography to create five different KML files, which gave us one map for each period.
For one of the transformations, instead of filtering by the @period attribute, we placed the range of years encoded in the @from and @to attributes inside a <TimeSpan> element.
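
Both strategies can be sketched in Python as follows. This is an illustration under assumptions, not our stylesheet: the <listPerson> wrapper, the second poet and the function names are hypothetical, and the TEI namespace is omitted. In KML, a <TimeSpan> carries its limits in <begin> and <end> children.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def group_by_period(persons):
    """Collect persons under the value of floruit/@period (one map per key)."""
    groups = defaultdict(list)
    for person in persons:
        floruit = person.find("floruit")
        if floruit is not None:
            groups[floruit.get("period")].append(person)
    return dict(groups)


def time_span(floruit):
    """Build a KML <TimeSpan> from the @from and @to years of a floruit."""
    span = ET.Element("TimeSpan")
    ET.SubElement(span, "begin").text = floruit.get("from")
    ET.SubElement(span, "end").text = floruit.get("to")
    return span


# The second person is a made-up record, present only to give a second period.
root = ET.fromstring("""<listPerson>
    <person><persName>Johan Garcia de Guilhade</persName>
        <floruit from="1240" to="1265" period="3"/></person>
    <person><persName>Hypothetical poet</persName>
        <floruit from="1300" to="1320" period="4"/></person>
</listPerson>""")

groups = group_by_period(root.iter("person"))
span = time_span(root.find("person/floruit"))
print(sorted(groups), ET.tostring(span, encoding="unicode"))
```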

Embedding the maps in our website

We uploaded the KML files to Google Maps.
Then we opened them in the Google Maps Engine API.
After adding a description to our maps, we published them, changed the sharing settings to Public, and copied the shared link, which we placed in our HTML file inside an <iframe> element.
To embed the Google Earth globe, we used the KML embed gadget.

Other criteria

We specify the level of credibility of each of our personography entries using @cert attributes. This allows us to represent that credibility visually, which makes these maps more legitimate from a historiographic point of view.
When we could not establish a birthplace for one of the poets, we used the nationality information as a reference. We randomly selected a geographical point for each nationality and used it as the birthplace coordinates, adding a @cert="unknown" attribute. For example, the birthplace of every Galician poet we could not place is represented on the map as a light blue pin over what could be described as the geographical center of the Galician territory.
We were more rigorous when dealing with the poets' movements than with those of the courts. That is, for every affiliation, we chose the coordinates of what we could define as the main court and used them systematically.
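
The fallback rule for unknown birthplaces amounts to a simple lookup, sketched here in Python. Everything specific in the example is a placeholder: the "gl" key and the coordinates are invented for illustration and are not the points used on our maps, and the function name birthplace is hypothetical.

```python
# Placeholder reference points keyed by nationality code. Both the "gl" key
# and the coordinates are invented for this sketch; they are NOT the points
# used on the project's maps.
FALLBACK_POINTS = {"gl": "42.8,-8.0"}


def birthplace(geo_text, nationality_key, fallback_points=FALLBACK_POINTS):
    """Return (coordinates, cert); cert is 'unknown' when the fallback is used."""
    if geo_text and geo_text.strip():
        return geo_text.strip(), None
    return fallback_points[nationality_key], "unknown"


print(birthplace("41.490833,-8.660556", "pt"))
print(birthplace(None, "gl"))
```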


Social Network Analysis

To find out which were the most relevant cultural centers in the Iberian kingdoms during the Middle Ages, we graphed relational patterns as a social network, which you can find here.

As we explained in the mapping section, one of the problems of building historical personographies is data accuracy. In our sources, the information we care about is very often presented with some degree of uncertainty. Moreover, we are studying a period of upheaval, defined by conflicts, changes and reorganizations. A seemingly simple location reference, such as a particular king's court, becomes a very difficult one when we try to convert it into coordinates. Kings and noblemen move, and their courts and political influence move with them. The references are often unclear, and although we have tried to be as rigorous as possible, we have had to make some concessions to inaccuracy in favor of functionality.

However, social network methods can be much more accurate in some respects, because they disregard geographical data and precise time references and keep only the social relationships.

Methodology

To build the social network we used Cytoscape, an open-source software platform for visualizing interaction networks. Cytoscape was originally designed for biological research, but it is now a general platform that can be used to visualize and analyze network graphs of any kind involving nodes and edges (e.g. social networks).
We transformed our personography into .csv files using XSLT.
Once we had our data in tables, we imported them into Cytoscape and created the initial network.
We defined the columns containing the troubadours and their patrons as nodes. The period was chosen as the interaction value, and then we selected the nationality as an edge attribute. Afterwards, we uploaded the table containing the type of patron (either royal, noble or unknown) as a node attribute.
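
The kind of table described here can be sketched in Python. This is an illustration only, since we produced our .csv files with XSLT: the column headers (source, interaction, target, nationality) and the function names are assumptions made for the example, and the person record is a trimmed copy of the personography sample without the TEI namespace.

```python
import csv
import io
import xml.etree.ElementTree as ET

PERSON_XML = """<person>
    <persName>Johan Garcia de Guilhade</persName>
    <nationality key="pt" cert="high">Portuguese</nationality>
    <floruit from="1240" to="1265" period="3">
        <location corresp="#sousa" from="1240" to="1243"/>
        <location corresp="#jaen" from="1244" to="1248"/>
    </floruit>
</person>"""


def edge_rows(person):
    """Yield one row per affiliation: poet, period, patron, nationality."""
    name = person.findtext("persName")
    nationality = person.findtext("nationality")
    floruit = person.find("floruit")
    for loc in floruit.findall("location"):
        yield [name, floruit.get("period"),
               loc.get("corresp").lstrip("#"), nationality]


def write_edges(persons, out):
    """Write a header line followed by the edge rows of every person."""
    writer = csv.writer(out)
    writer.writerow(["source", "interaction", "target", "nationality"])
    for person in persons:
        writer.writerows(edge_rows(person))


buffer = io.StringIO()
write_edges([ET.fromstring(PERSON_XML)], buffer)
print(buffer.getvalue())
```
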
Cytoscape automatically analyses the data and gives you back an exhaustive report. This means that you can customize your graph according to centrality measures, that is, the relative importance of nodes and edges in the network (the elements that keep it together, in other words).
We customized the graphic elements of the network using Cytoscape's built-in tools. We mapped our data to node color, label, edge thickness, edge color, etc. according to the following configurations:
- Node size and label size depend on degree centrality: the bigger the node, the higher its number of ties to other nodes.
- The width of the edges is related to edge betweenness, that is, the number of shortest paths between pairs of nodes that run along an edge. The thicker the stroke, the greater its influence on the transfer of relations through the network.
- The color of a node indicates its type. We distinguish poets from the “venues” where they carried out their cultural activity. Within this last category, we indicate whether the venue is a royal court, a lord's patronage, or an actual geographic place.
- The type of stroke makes it possible to visualize the origin of the poets so, in a way, we still preserve part of our geographical data.
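
To make the first of these mappings concrete, the Python sketch below counts the ties of each node and scales node size accordingly. Cytoscape computes these measures itself, so this only illustrates what degree means here; the second poet ("poetB"), the size range of 20 to 80 and the function names are all invented for the example.

```python
from collections import Counter


def degrees(edges):
    """Count, for every node, the number of edges that touch it."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts


def node_size(degree, max_degree, smallest=20.0, largest=80.0):
    """Scale a degree linearly into a display size (range is arbitrary)."""
    return smallest + (largest - smallest) * degree / max_degree


# Affiliations of the sample poet, plus one made-up poet sharing a court.
edges = [
    ("JGarGlh", "sousa"), ("JGarGlh", "jaen"),
    ("JGarGlh", "afonsX"), ("JGarGlh", "afonsIII"),
    ("poetB", "afonsX"),
]
deg = degrees(edges)
sizes = {node: node_size(d, max(deg.values())) for node, d in deg.items()}
print(deg.most_common())
print(sizes)
```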


Linguistic Analysis

How did the Portuguese pronoun system change throughout the Late Middle Ages?

The Language

Galician-Portuguese was the language spoken in the western area of the Iberian Peninsula until the mid-fourteenth century. Derived from Latin, it emerged progressively as a different language before the ninth century, in the territories of the old Kingdom of Galicia, which covered the regions of modern-day Galicia and northern Portugal. In this sense, we can say that the term Galician-Portuguese designates not only a language, but also a phase in its development, which would later lead to the differentiation between the present-day Galician and Portuguese languages due to political and sociolinguistic circumstances.

Some History

Portuguese, like most other Romance languages, came from Latin, or more specifically, from an altered form of Latin called Vulgar Latin. The transition from Vulgar Latin to the Romance languages was due to many factors, notably the relative geographic isolation of the different groups of language users and subsequent linguistic superimpositions by substrate languages. An example of such superimposition is the intensified stress accent imposed upon the Vulgar Latin of the Iberian Peninsula by invading Germanic peoples. Vulgar Latin along the western coast of the Iberian Peninsula, which had less of the Germanic stress accent than anywhere else, grew into Portuguese. The exact date of this transformation is unclear, but the earliest documents in Portuguese appeared at the end of the twelfth century and mark the historical beginning of Old Portuguese.

For four centuries, the language underwent many changes, one of the most important being a marked intensification of the stress accent, which was reflected in an increase in syncope found in verse and in the tendency toward greater word individuation. Toward the end of the sixteenth century, nearly all the distinctive characteristics of Old Portuguese had disappeared, leaving the rough equivalent of the modern Portuguese spoken today.

Methodology

For the diachronic variation study, we chose a sample of authors from the five time periods into which we divided our timeline. Following Williams (1962), we marked up pronouns and categorized them, according to Williams's standards, as old or new. Then we assessed how accurate Williams's claims are by looking at the distribution of the old and new pronouns across the five time periods.

Williams's claim is that Old Portuguese appeared at the end of the 12th century and had completed its transformation into modern Portuguese by approximately the end of the 16th century. Our goal in looking at the pronouns used in the poetry of medieval Galician troubadours is to see whether Williams's claims are reflected in the distribution of pronouns across the poems.

The Galician-Portuguese poetic movement remains a rich cultural event in the history of the Iberian Peninsula. The troubadours and minstrels who authored the songs were greatly revered, as were their works. Therefore, the Portuguese used in their poems could be seen as representative of the language being used at the time. If anything, it is one of the most consistent collections of texts because it spans roughly three centuries during the evolution of the language into modern Portuguese. By tracking the use of archaic and modern pronouns in the poems, we can begin to understand the timeline of how the Portuguese language changed over time.

This, obviously, only scratches the surface of language change. In this project, we look only at personal pronouns because they underwent quantifiable changes that could be easily recognized and marked. A more in-depth study might include other linguistic features, such as phonological features reflected in orthography. We hope that this project might open the door to more research in this field.

The Markup

After formatting the poems in an XML document, the next step was to mark all the personal pronouns that Williams mentioned in his study. Below are charts taken from Williams's research.

Archaic	Modern
mego, comego, migo	comigo
tego, contego, tigo	contigo
sego, consego, sigo, consigo	consigo
mim	mi
ele	el
eis	eles
nosco	connosco
vosco	convosco
li	lhe
lis, lhis	lhes
che	te
xe	se

In the texts, we used regex to mark all personal pronouns, modern and archaic, with the following tag:

<w ana="#pronoun">pronoun</w>

Importantly, we did not want to miss any pronouns that were hyphenated or had been assimilated into surrounding words. Therefore, we used the expression:

Find: ([^a-z])(mi[m]?)([^a-z])

Replace with: \1<w ana="#pronounCode">\2</w>\3
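
The same find-and-replace can be tried out in Python's re module. This sketch is an illustration, not the procedure we followed in our editor; the function name tag_pronoun is invented, and mim? is simply a shorter spelling of mi[m]?.

```python
import re

# Same pattern as above: a non-letter, then "mi" or "mim", then a non-letter.
PATTERN = re.compile(r"([^a-z])(mim?)([^a-z])")


def tag_pronoun(text, code="pronounCode"):
    """Wrap each match of the pronoun in a <w> element, keeping its context."""
    return PATTERN.sub(rf'\1<w ana="#{code}">\2</w>\3', text)


print(tag_pronoun("e de mi non"))
print(tag_pronoun("amigo "))  # unchanged: "mi" inside a word is not tagged
```

Because the pattern consumes the characters on both sides of the pronoun, a pronoun at the very start or end of the text is not matched, and two pronouns separated by a single character need a second pass.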

The pronoun code refers to the feature library document, which is available to view on our GitHub. This allows us to find and count the pronouns based on their age classification, which is marked in the feature library. Using these labels, we will be able to calculate the percentage of modern and archaic forms and identify their distribution across the time spectrum of the poets.

Using eXide

First, we counted how many of the pronouns we had marked were archaic and how many were modern. We used XQuery to do so:

declare namespace tei="http://www.tei-c.org/ns/1.0";
let $docTexts := doc('/course/iberian/data/songTextsThirdTry.xml')/tei:TEI
let $docFeat := doc('/course/iberian/aux/features.xml')/tei:TEI
(: words whose @ana points to a feature structure classified as archaic :)
let $archaic := $docTexts//tei:w[substring(@ana,2) =
    $docFeat//tei:fs[tokenize(@feats/string(),'\s+') = '#archaic']/@xml:id]
return $archaic/string()

Then we used another query to filter the results by period:

let $period := $docTexts//tei:div[descendant::tei:affiliation/@corresp='1']
(: same selection, restricted to the texts of the chosen period :)
let $archaic := $period//tei:w[substring(@ana,2) =
    $docFeat//tei:fs[tokenize(@feats/string(),'\s+') = '#archaic']/@xml:id]
return $archaic
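
The logic of the two queries, combined into a count per period and age class, can be sketched in Python. This is an illustration under assumptions, not our eXide setup: the tiny documents, the feature identifiers (migo, comigo), the "#modern" label and the function name age_counts are invented for the example, and the TEI namespace is omitted. Only the "#archaic" label and the affiliation/@corresp period filter come from the queries above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # expanded name of xml:id

# Made-up miniature feature library and text file for the sketch.
FEATURES = ET.fromstring("""<fLib>
    <fs xml:id="migo" feats="#pronoun #archaic"/>
    <fs xml:id="comigo" feats="#pronoun #modern"/>
</fLib>""")

TEXTS = ET.fromstring("""<body>
    <div><affiliation corresp="1"/>
        <l><w ana="#migo">migo</w> e <w ana="#migo">migo</w></l></div>
    <div><affiliation corresp="3"/>
        <l><w ana="#migo">migo</w> e <w ana="#comigo">comigo</w></l></div>
</body>""")


def age_counts(texts, features):
    """Count tagged words per (period, age class); assumes divs are not nested."""
    age_of = {}
    for fs in features.iter("fs"):
        feats = fs.get("feats", "").split()
        for age in ("archaic", "modern"):
            if "#" + age in feats:
                age_of[fs.get(XML_ID)] = age
    counts = Counter()
    for div in texts.iter("div"):
        affiliation = div.find(".//affiliation")
        period = affiliation.get("corresp") if affiliation is not None else None
        for word in div.iter("w"):
            age = age_of.get(word.get("ana", "")[1:])  # drop the leading "#"
            if age is not None:
                counts[(period, age)] += 1
    return counts


counts = age_counts(TEXTS, FEATURES)
print(sorted(counts.items()))
```

Dividing each count by the total for its period gives the percentage of archaic and modern forms in that period.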
