Between Legacy and Innovation
Archaeological Data Re-use and Integration, a Case Study from Northern Sudan
Introduction
1This article considers the use of historical survey data in current archaeological research in Sudan. While archaeologists regularly make use of data from previous research, especially when this is published, there is limited critical engagement with how this data is used, and little reflection on methodological approaches. The data are rarely used to their full potential, and in some cases a lack of critical engagement leads to misconceptions or misunderstandings in current archaeological research. Through the example of a region in modern Sudan being investigated by the DiverseNile project (PI Julia Budka), Ludwig-Maximilians-Universität (LMU), Munich, this article will discuss our use and integration of data from the 1970s. The main aim is to discuss the different ways that we make use of this data, to set out the types of questions which we should consider when doing so, and to suggest avenues for future research.
2The DiverseNile project is an ideal case study to consider some of these issues, as the project combines a range of archaeological methods and also integrates historical and current archaeological data. As such, most of the discussion below will draw on examples of our use of data from an archaeological survey conducted in Sudan between 1970 and 1975. This survey fits into a long history of archaeological surveying in the Nile Valley. The remoteness of some of the sites, as well as the ongoing civil war in Sudan, means that we must increasingly rely on existing archaeological data to study this particular region. As will be discussed in more detail, the use of historical data is widespread in archaeological research but it is not always problematised effectively from a methodological perspective.
3First, a brief background to the DiverseNile project and the 1970–1975 survey is provided. The article will then turn to considering previous discussions on historical data, primarily in archaeology. The Results and Discussion sections address some of the more specific considerations when using data in the project.
Background
DiverseNile project
4The aim of the DiverseNile project is to refine the cultural entanglement concept (Stockhammer 2013: 17; Der – Fernandini 2016: 20; Silliman 2016: 31) in Bronze Age[1] Nubia, where research is currently highly biased towards elite and religious sites (Carrano – Girty – Carranos 2009; van Pelt 2013; Budka 2019; Lemos – Budka 2021). The project uses a peripheral region in northern Sudan to reconsider settlement and burial sites without being influenced by existing cultural categorisations, looking at this ‘contact space’ as showing acceptance, appropriation or ignorance of different cultural influences (Stockhammer – Athanassov 2018: 106; Budka 2019). The reconstruction of these cultural encounters also includes a detailed assessment of the material record and analyses of production activities, technologies, and foodways, as well as the investigation of the geological and geomorphological features of the study area. This provides a framework for historical and cultural processes, comparable to other examples around the world (e.g. Dietler 2010: 56; Sulas – Pikirayi 2020; Bubel 2022; Cohen 2022).
Historical Survey 1970–1975
5The DiverseNile research area is in Northern Sudan, in the Munich University Attab to Ferka Survey Project (MUAFS) concession (Fig. 1). This region is both a geographical and a cultural boundary zone (see Budka 2019; Budka 2020 for more detailed discussion). An invaluable resource for the project – especially given its regional scope – is a 1970–1975 archaeological survey undertaken in part of the concession area and surrounding region as a collaboration between the SFDAS (Section Française de la Direction des Antiquités du Soudan) and NCAM (National Corporation for Antiquities and Museums) directed by André Vila and – as named in the publications – assisted by Subhi Iskander Daoud, René Filliol, Alain Fouquet, Francis Geus, Arbab Hassan Hafiz, Sid Ahmed Abd el Magid Kamir, Yves Labre, Bakri Mirghani Makki, Osman Suleiman Mohammed, Yussef Muktar, Osmah el Nur, Gonzague Quivron, Jacques Reinold, François Rodriguez, and Abd el Halim Mohammed Taha (Vila 1975a, Vila 1975b, Vila 1976a, Vila 1976b, Vila 1977a, Vila 1977b, Vila 1977c, Vila 1977d, Vila 1978a, Vila 1978b, Vila 1979). The area stretched from the Dal Cataract, just north of the concession, to Nilwatti, an island to the south of the MUAFS concession (Vila 1975a).
6The survey took place between 1970 and 1975 and was published in 15 volumes. The discussion below focuses on the first 11 volumes, which present the results of the survey. Volumes 12–15 focus on the necropolis of Missiminia, which was studied in more detail than many of the other sites (including excavation) due to local construction work (Vila 1980: 5). All volumes are available on the website of the SFDAS (Section française de la direction des antiquités du Soudan); Volume I is an introductory volume which contains detailed information on the survey, including the methods used (Vila 1975a, Vila 1975b, Vila 1976a, Vila 1976b, Vila 1977a, Vila 1977b, Vila 1977c, Vila 1977d, Vila 1978a, Vila 1978b, Vila 1979). Unlike the other volumes, which are exclusively in French, the introduction is also provided in English (Vila 1975a). Volume XI is a concluding volume of the survey which includes some analyses and summaries based on the location, date, and type of sites (Vila 1979).
7The amount of information, images, and sketch-plans varies considerably depending on the nature of the site and the state of preservation. As such, one entry can range from a couple of sentences to several pages in length (Fig. 2).
8However, all entries contain at least the following basic information:
  • District (administrative division): Abri (East and West); Amara (East and West); Arnyatta Island; Attab (East and West); Dal; Ferka (East and West); Ginis (East and West); Koyekka; Mograkka (East and West); Morka; Nilwatti Island; Sarkamatto; Tabaj (East and West).
  • Place Name
  • Site Code/Registration Number: this is based on a grid division of the 1:250 000 map of the Sudan Survey. The site code or number is based on the labelling of this grid and assigned an additional number in order of discovery (e.g. 1-A-2 or 3-K-4 etc). In some cases, letters are used to further distinguish between parts of the site (e.g. 2-S-42 A and 2-S-42 B); while numbers are added to distinguish excavated areas or objects collected. For example, 2-W-3/3 is a single excavated grave in site 2-W-3 and 2-W-3/3/1 a small jar found in the excavation of the grave. The sites are all on survey map NF 36-M, which is indicated for each entry.
  • Cadastral Map: 1:50 000 maps which are more precise than the survey map used for the site code. This is either: 28.B; 29.B; 29.C; 30.C; or 30.D.
  • Air Photographs: these are air photographs from the Sudan Survey Department. Archaeological sites are pinpointed both on the photographs themselves and on a scale plan on the back. Only two sites are out of shot of these photographs.
  • Initials of the Person Recording the Information
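The structure of the site codes described in the list above lends itself to a small parser, which can be useful when the published entries are entered into a database. The following is a minimal sketch only: the class name `SiteCode`, the function `parse_site_code`, and the field names (`part`, `feature`, `find`) are hypothetical labels introduced here, and the pattern covers only the code shapes quoted in the list (e.g. 1-A-2, 2-S-42 A, 2-W-3/3, 2-W-3/3/1). Other variants may exist in the survey volumes.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern for the code shapes quoted in the text; this is an assumption,
# not a full specification of the survey's registration system.
_CODE = re.compile(
    r"^(?P<square>\d+)-(?P<subsquare>[A-Z])-(?P<sequence>\d+)"
    r"(?:\s+(?P<part>[A-Z]))?"      # optional letter for part of a site
    r"(?:/(?P<feature>\d+))?"       # optional excavated area, e.g. a grave
    r"(?:/(?P<find>\d+))?$"         # optional object from that area
)


@dataclass(frozen=True)
class SiteCode:
    square: int                     # numbered 15 x 15 minute square
    subsquare: str                  # lettered 3 x 3 minute square
    sequence: int                   # order of discovery
    part: Optional[str] = None
    feature: Optional[int] = None
    find: Optional[int] = None

    @property
    def site(self) -> str:
        """Return the bare site code without part, feature or find."""
        return f"{self.square}-{self.subsquare}-{self.sequence}"


def parse_site_code(text: str) -> SiteCode:
    """Split a registration number into its components."""
    match = _CODE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised site code: {text!r}")
    g = match.groupdict()
    return SiteCode(
        square=int(g["square"]),
        subsquare=g["subsquare"],
        sequence=int(g["sequence"]),
        part=g["part"],
        feature=int(g["feature"]) if g["feature"] else None,
        find=int(g["find"]) if g["find"] else None,
    )
```

Parsing 2-W-3/3/1 in this way separates the site (2-W-3) from the excavated grave and the jar found in it, so that finds and features can be grouped under their parent site.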
9Some sites were also partially excavated or cleared – for example, through a test pit or the excavation of one or more tombs in some of the cemeteries. At some sites finds were collected, the methods for which are not always specified, although in some cases this was exhaustive or a systematic sampling method (e.g. transects) was applied. In total 462 sites were identified during the survey, although given the different cultural phases present at certain sites, the concluding volume of the survey is based on 544 ‘sites’ – different time periods at the same site being distinguished but removing sites which are based only on a single isolated find (Vila 1979).
10There is a long history of organised archaeological survey in Sudan, starting in 1907 with the expansion of the Aswan Dam (Adams 2007: 48; Ahmed 2012: 254). However, much of this early work focused on salvage archaeology, in regions that would be submerged by various phases of the dams, and was heavily biased towards cemetery archaeology (Adams 2007: 50), as well as other monumental sites (Ahmed 2012: 253). In part, this was due to the personal focus of many of the early archaeologists in the region, as well as a desire to collect objects for western museums (Adams 2004: 112, 2007: 48). Non-salvage surveys increased from the 1950s onwards with a number of surveys in various provinces, and Vila’s 1970s survey continued south from the survey started due to the building of the High Dam (Ahmed 2012: 262; Edwards – Mills 2020: 4). Unsurprisingly, these inconsistencies in how, and where, surveys have been conducted in Sudan have led to a certain amount of bias in the types of sites included and regions considered, the focus remaining predominantly on the Nile Valley and to the north of Khartoum (Ahmed 2012: 263). The inconsistencies prevalent in the survey of Lower Nubia (Trigger 1965: 54) are therefore even more exacerbated in Upper Nubia and the rest of modern-day Sudan due to various factors linked to historical and political context, academic biases, and economic factors. As with many salvage projects in the region, the focus on archaeological remains as opposed to the lives of the people still living there is problematic (see for example, Carruthers 2022; Näser – Kleinitz 2012), although some projects still managed to carry out work on local knowledge and culture (Haberlah 2012; Kleinitz – Näser 2012).
11While the results of many of these surveys were published, others remain only in their archival format, as collected during the projects. This includes the results of the Archaeological Survey of Sudanese Nubia, some of which was only published relatively recently in 2020 (Edwards – Mills 2020). This demonstrates the huge, and in many cases untapped, potential of some of these surveys, which ranges from use in some of the earliest studies of settlement distribution in Sudan (e.g. Trigger 1965) to ongoing projects, such as the DiverseNile project.
Re-Use of Historical Data and Legacy Data
12Legacy data is typically used to refer to data that are in an obsolete format and need to be altered to be used in digital mapping, such as in GIS (Allison 2008). Many methods have been developed to make effective use of such data, including the application of modern techniques to old data or evidence and the re-contextualisation of data, and the definition in archaeology has been broadened to refer to existing data from previous archaeological research (Wylie 2017). Despite an increasing emphasis on data sharing in ongoing projects, the systematic data collected in archaeological research since the 20th century are highlighted as worthy of much more research and use than has been achieved so far (Kintigh – Altschul – Beaudry et al. 2014: 879; McManamon – Kintigh – Ellison et al. 2017: 246). When understood in their historical contexts, these data provide an invaluable source of information that has yet to be explored to its full potential. Furthermore, using past archaeological data can help us reflect on current recording practices and data collection, particularly in terms of accessibility, usability, and usefulness.
13A crucial discussion on uses of old data is what constitutes ‘use’ as opposed to ‘re-use’. Some advocate that the term use – no matter who or when – can be applied as long as the use matches the original reason that the data was collected. For example, in the case of survey data collected to provide information on the location of archaeological sites in a particular region, using the data from that survey to relocate or map those sites would constitute use – as that was the reason the information was recorded in the first place. Re-use, by contrast, would apply in cases where the data are used for a secondary purpose, different from the reasons why the data was originally collected (Zimmerman 2008: 634; Faniel – Kriesberg – Yakel 2016: 1404; Huggett 2018: 96). However, Huggett (2018) argues convincingly that use of data should only apply when the data are used by their creator or originator, and re-use when they are used by someone else. This simplifies the question of original intent and whether subsequent use can ever fully match original purposes or reasons. Despite these discussions, re-use of data is still relatively rare, even in cases where it is made available online and in open-access format (Gartski 2022: 177).
14Similarly to archaeological archives, the ability to contextualise existing data is crucial in order to re-use it both effectively and reliably (Wylie 2017: 219). As I have argued elsewhere, an understanding of a range of contexts is necessary to fully understand and use archaeological archives and records in current research (Ward 2022: 162). These contexts range from the broad historical background and the influence of academic paradigms to the modern contexts that the data is being considered in (Dallas 2015: 180). Their influence on data collection, use, curation, publication, re-use, and re-interpretation should not be underappreciated. In the case of legacy data, a similar emphasis is placed on the existence of paradata – referring to the recording of processual information – and metadata – descriptive information on the data (e.g. authorship, date of collection, etc.) (Couper 2000: 393). Paradata also includes any manipulation (often digitised or computerised) of the data collected (Huggett 2018), although the importance of providing raw, unprocessed, data is also emphasised where possible (Hart – Barmby – LeBaur et al. 2016: 3; White – Baldrigde – Brym et al. 2013: 3).
15Despite a relatively limited amount of data re-use, data sharing is now widely emphasised and encouraged in archaeological research, although the amount and scale of this does vary (see for example, Opitz – Mogetta – Terrenato 2016; Opitz 2018; Amara West Research Space 2023; Çatalhöyük Research Project 2023). There are an increasing number of online platforms which store and publish archaeological data (e.g. Archaeological Data Service 2023, Open Context 2023) with different amounts of data cleaning included, and the importance of providing the information surrounding data and its collection (i.e. metadata and paradata) has also been widely established (Hart – Barmby – LeBaur et al. 2016: 6; White – Baldrigde – Brym et al. 2013: 2–3). Despite this, there are only limited resources on how this data can and should be reused (Ativi – Kansa – Lev-Tov et al. 2013: 664). A fundamental issue is not just re-use but effective data integration; digitising different datasets means that they can be combined more effectively (Brancato 2019). A number of challenges have been raised in the integration and comparison of legacy data relating to surveys, including a variety of data formats, inconsistent terminology and a lack of contextual information – including on methodology (Faniel – Kansa – Kansa et al. 2013: 302; Casarotto 2022: 430). Re-use of data has led to attempts at standardisation, accepting that physical accessibility is not the only limiting factor in accessing past data (for example, the development of FAIR [Findability, Accessibility, Interoperability, Reusability] principles [Wilkinson – Dumontier – Aalbersberg et al. 2016; for their application in archaeology see de Haas – Van Leuven 2020; Nicholson – Kansa – Gupta et al. 2023]), as well as an increasing amount of information on best practice in sharing archaeological data and the practicalities in referencing this (e.g. Hart – Barmby – LeBaur et al. 2016; Marwick – Pilaar Birch 2018; White – Baldrigde – Brym et al. 2013). However, this standardisation can be problematic and imposes strict recording practices across archaeological sites, despite the fact that the curation of archaeological records is increasingly taking place in the field (Dallas 2015). Therefore, interactiveness needs to be emphasised just as much as integration, with the management of the data going back as far as the design of a particular recording system (Yakel 2007: 338; Dallas 2015: 192; Ward 2022: 172).
Methods
16The following section considers the usability and potential of data from the 1970–1975 survey in the DiverseNile project. In line with the above, the recording methods used during the survey are first discussed to showcase how the data can be integrated into current research projects and practices.
17The recording system used in the 1970s survey is partly based on conventions outlined by previous surveys in Sudan. For example, the site code and registration number follow a convention based on maps of Sudan at 1:1 000 000 and 1:250 000 scale, which were divided into grids based on degrees of longitude and latitude (Fig. 3) (Adams 1961: 8; Vila 1975a: 23; Hinkel 1977: 24–28). The survey area NF 36-M is derived from this, with NF 36 based on the 1:1 000 000 grid (longitude 30°–36° and latitude 20°–24°), and -M based on a further subdivision (longitude 30°–31°30’ and latitude 20°–21°). Fig. 3 shows these grids superimposed on modern satellite imagery.
18The area (NF 36-M) is then further subdivided into 25 squares of 15 x 15 minutes (which are numbered), and each of these into 25 squares of 3 x 3 minutes (assigned a letter in alphabetical order) (Fig. 4). For example, square 30°15’–30°30’ x 20°45’–21° is numbered 2 (giving the first number of the reference for any site found in that 15 x 15 minute square, i.e. the 2 in 2-S-54); within that square, 30°24’–30°27’ x 20°48’–20°51’ is labelled S (giving the letter of the reference for any site found in that 3 x 3 minute square, i.e. the S in 2-S-54). The last number of the reference is unique to each site and based on the order of discovery within each 3 minute area (i.e. the 54 in 2-S-54) (Adams 1961: 8; Vila 1975a: 23). The same grid system is still used in more recent survey projects in Sudan (see for example the Merowe Dam Archaeological Salvage Project. The SARS Amri to Kirbekan Survey 1999–2007 [Welsby 2023]).
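The grid logic described above can be expressed as a short calculation that turns a coordinate into the first two elements of a site reference. This is a sketch under stated assumptions rather than a reproduction of the survey’s own procedure. The function name `grid_reference` is hypothetical. The ordering of squares (west to east, then north to south, starting in the north-west corner) is inferred from the two worked examples in the text, which it reproduces. The number of columns is derived from the stated extent of the sheet: 1°30’ of longitude gives six columns of 15 minutes and 1° of latitude gives four rows, and this layout is an assumption of the sketch, not a detail taken from the survey volumes.

```python
import string

# Extent of sheet NF 36-M in decimal degrees, as given in the text.
WEST, EAST = 30.0, 31.5
SOUTH, NORTH = 20.0, 21.0

COLS_15 = 6                             # assumption: 90 minutes / 15 minutes
LETTERS = string.ascii_uppercase[:25]   # A to Y for the 5 x 5 subdivision


def grid_reference(lon: float, lat: float) -> tuple:
    """Return (square number, square letter) for a point on NF 36-M.

    Squares are counted west to east and north to south, an ordering
    inferred from the examples in the text (square 2, letter S).
    """
    if not (WEST <= lon < EAST and SOUTH < lat <= NORTH):
        raise ValueError("point lies outside sheet NF 36-M")

    east_min = (lon - WEST) * 60.0      # minutes east of the western edge
    south_min = (NORTH - lat) * 60.0    # minutes south of the northern edge

    col15, in_col = divmod(east_min, 15.0)
    row15, in_row = divmod(south_min, 15.0)
    number = int(row15) * COLS_15 + int(col15) + 1

    # Clamp to 4 as a guard against floating-point edge effects.
    col3 = min(int(in_col // 3.0), 4)
    row3 = min(int(in_row // 3.0), 4)
    letter = LETTERS[row3 * 5 + col3]
    return number, letter
```

A point at 30°25.5’ E, 20°49.5’ N (30.425, 20.825 in decimal degrees) falls in the 3 x 3 minute square used as the example above and returns square 2, letter S. The final element of a reference, the order of discovery, cannot be derived from coordinates and has to come from the published entries.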
19On recording, each site was assigned a symbol based on its type – settlement (S), cemetery (C), or rock art (R) – and its extent and/or state of preservation – 1 to 5 (Fig. 5) (Vila 1975a: 21–22). This system and the paperwork used for the recording of sites in the area by Vila and his team are bespoke to this survey.
20Thanks to the grid system used, it is possible to manually georeference the plans published in the survey volumes in QGIS (Fig. 6). The detailed maps for each district include all of the sites surveyed in the 1970s, so their approximate locations can be integrated into the DiverseNile project’s GIS. As discovered on the ground, these locations are not always fully accurate, but they provide enough information to help reidentify sites and are accurate enough to show the distribution of sites in the concession area. Even without groundtruthing, entering the site locations and classifications into a GIS database makes the 1970s survey data much more accessible to both the project and, hopefully, to future researchers.
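To illustrate the principle behind this kind of georeferencing, the sketch below fits a simple affine transformation from three control points, such as grid intersections that can be read off a scanned plan. It is a minimal illustration in plain Python and not a description of the settings used in QGIS for the project, which are not specified here. The function names `fit_affine` and `_solve3` and all control point values are hypothetical and invented for the example.

```python
def _det3(a):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def _solve3(m, v):
    """Solve the 3 x 3 system m * x = v with Cramer's rule."""
    d = _det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = v[i]
        solution.append(_det3(mk) / d)
    return solution


def fit_affine(control_points):
    """Build a pixel -> (lon, lat) function from exactly three points.

    control_points: list of ((pixel_x, pixel_y), (lon, lat)) pairs.
    """
    if len(control_points) != 3:
        raise ValueError("this sketch expects exactly three control points")
    m = [[float(px), float(py), 1.0] for (px, py), _ in control_points]
    a = _solve3(m, [geo[0] for _, geo in control_points])
    b = _solve3(m, [geo[1] for _, geo in control_points])

    def to_geo(px, py):
        return (a[0] * px + a[1] * py + a[2],
                b[0] * px + b[1] * py + b[2])

    return to_geo


# Invented example: three corners of a 15 x 15 minute square on a scan.
to_geo = fit_affine([
    ((0, 0), (30.25, 21.00)),
    ((1000, 0), (30.50, 21.00)),
    ((0, 1000), (30.25, 20.75)),
])
lon, lat = to_geo(500, 500)   # centre of the scanned square
```

With only three points the fit is exact and gives no measure of error. In practice more control points and a residual check would be needed, which is one reason why locations derived from the published plans are treated as approximate until confirmed on the ground.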
Results
Use and integration
21The DiverseNile project focuses on Bronze Age sites in the region, using data from the relevant parts of the 1970s survey (Vila 1976a; Vila 1976b; Vila 1977a; Vila 1977b) as a starting point but also including additional relevant sites identified on the ground, and the hinterland beyond the original survey area. Fig. 7 shows the 312[2] sites included so far in the GIS database for the MUAFS concession. 219 sites are from the 1970s survey (163 of which have been relocated and updated by the PI so far) and 95 additional sites have also been identified by the PI. The Bronze Age sites include 35 of those newly identified and 61 from the 1970s survey.
22Comparing the data available for the concession area and for the region surveyed as a whole in the 1970s demonstrates that the distribution of sites is broadly similar. This confirms that the smaller concession area being studied by the DiverseNile project is representative of the region as a whole (Fig. 8).
23To give more specific examples of some of the sites re-identified as part of the project, 2-S-54 was recorded in 1973 as a rectangular structure built of a mixture of mudbrick and schist; a second, circular stone structure is described to the north (Fig. 2). Both of these are still clearly visible today and the dimensions match (Fig. 9). This is a good example of a site that is very clearly the same as the one recorded in the 1970s and can be relatively easily confirmed as such and groundtruthed. This means that any relevant information can be updated and included in the DiverseNile GIS database. As the site was also excavated in 2023 (see Budka – Rose – Ward 2023), more detailed information will also be integrated with the data originally collected in the 1970s.
24In other cases, this re-identification is slightly less straightforward, for example AtW 001 (for more information on the site see Budka 2022; Budka – Rose – Ward 2023). As can be seen in Fig. 10, the most extensive stage of this site was a midden forming a small mound which included a large amount of pottery, some of which is intact, and animal bone. Some remains of mudbricks were also found, which suggests there may once have been standing architecture at the site, but no remains were found in situ. However, the only small brick construction identified by the 1970s survey, preserved up to one course, differs significantly in size. Before this building was reidentified remotely in 2024 (see Budka 2024) it was unclear if AtW 001 was a ‘new’ site or part of Vila et al.’s 2-T-62, which was described as including a number of mounds with concentrations of pottery in the surrounding area, as well as the mudbrick building. This could not be determined one way or another because of the ambiguity of the published report, and so AtW 001 was treated as a distinct site by the DiverseNile project. However, with the re-identification of the mudbrick building, it now seems certain that AtW 001 can be treated as part of 2-T-62. Examples such as this one demonstrate the difficulty in integrating data from the survey with any new information and how this integration can change over the course of a project.
Discussion
25The previous examples provide an overview of some of the ways in which we are using Vila’s data as part of the DiverseNile project. It should be clear from those examples that we are using and engaging with this data in a number of different ways, each of which needs to be considered separately and includes re-use, re-assessment, integration and alteration of existing data, alongside the creation of new or complementary data and reflection on our own data collection and sharing practices.
26Re-use would include the relatively straightforward use of the 1970s data as a starting point for fieldwork and research (Huggett 2018). This is also in line with the main reason that the data were collected in the first place and published in the 1970s. However, when we re-locate these sites, we are also going a step further and re-assessing them. This includes giving them more precise locations but also confirming the type or time period of the site based on more recent research and conventions. In some cases, this includes fresh excavation which provides even more additional information. Sometimes this means altering the information provided by the original survey, for example re-dating the site (Budka 2019). 27% of the re-located sites have been registered in the DiverseNile database with a different date than the one published by the 1970–1975 survey.
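One way to make this kind of re-assessment traceable is to keep the published value alongside the revised one and to record why it was changed. The sketch below illustrates the idea and is not the structure of the DiverseNile database. The class `SiteRecord`, its fields, the function `share_redated`, and all site codes and period labels are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SiteRecord:
    code: str
    published_period: str            # as printed in the survey volumes
    current_period: str = ""         # project's current assessment
    log: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Until re-assessed, the current value mirrors the published one.
        if not self.current_period:
            self.current_period = self.published_period

    def redate(self, new_period: str, reason: str) -> None:
        """Change the date while keeping the published value and the reason."""
        self.log.append(
            f"period: {self.current_period!r} -> {new_period!r} ({reason})"
        )
        self.current_period = new_period

    @property
    def redated(self) -> bool:
        return self.current_period != self.published_period


def share_redated(records: List[SiteRecord]) -> float:
    """Fraction of records whose current date differs from the published one."""
    if not records:
        return 0.0
    return sum(r.redated for r in records) / len(records)


# Placeholder records; codes and period labels are invented.
records = [
    SiteRecord("X-1", "period A"),
    SiteRecord("X-2", "period A"),
    SiteRecord("X-3", "period B"),
    SiteRecord("X-4", "period B"),
]
records[2].redate("period A", "pottery re-examined on site")
```

Because the published value is never overwritten, a figure such as the share of re-dated sites can be recomputed at any point, and a later user can see both what the survey recorded and what was changed.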
27Inevitably, as part of these processes additional data are created which can vary considerably but includes the location of additional sites which were either overlooked or not published in the 1970s. One aim of the project is to extend the survey of sites further into the desert and hinterland, particularly on the left bank, which was not done as part of the 1970s work. This will provide additional information on the region and the distribution of sites, alongside the use of new methodologies such as scientific analyses of finds or micromorphology which would not have been possible in the 1970s.
28A key issue with these different approaches and types of uses is how we integrate and combine the data that we are creating and working with (Fitton – Ulas – Lisowski et al. 2022). This integration can vary based on the scale of study; for example, evidence from a specific archaeological site or the distribution of sites in the region. A decision was also made to re-number any sites which are being investigated (e.g. excavation, geophysics) as well as surveyed to keep information between the two different types of investigation distinct (Budka 2019). 2-S-54 has been labelled as AtW 002 for the purposes of excavation for this reason, although it can obviously be combined when studying the results. Due to the occasional difficulty in re-identifying sites, this is the best way to remain consistent in our approach. Integrating the data into a project with its own aims and objectives means that we are likely biased in the type of information we are emphasising and why. Some sites will have a considerable amount of additional information, such as 2-S-54/AtW 002, while others will simply have updated coordinates.
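The relationship between the two numbering systems can be kept explicit in a simple concordance, so that renumbered sites remain linked to their survey entries. The following sketch uses the two cases discussed in this article. The names `Link` and `build_index` and the relation labels `same_as` and `part_of` are assumptions introduced for illustration, not terms used by the project.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Link(NamedTuple):
    project_code: str    # DiverseNile code used for investigated sites
    survey_code: str     # code assigned by the 1970s survey
    relation: str        # "same_as" or "part_of" (hypothetical labels)


# The two relationships described in the text.
LINKS = [
    Link("AtW 002", "2-S-54", "same_as"),
    Link("AtW 001", "2-T-62", "part_of"),
]


def build_index(
    links: List[Link],
) -> Tuple[Dict[str, Link], Dict[str, List[Link]]]:
    """Index the concordance by project code and by survey code."""
    by_project: Dict[str, Link] = {}
    by_survey: Dict[str, List[Link]] = defaultdict(list)
    for link in links:
        by_project[link.project_code] = link
        # One survey site may contain several investigated areas.
        by_survey[link.survey_code].append(link)
    return by_project, dict(by_survey)


by_project, by_survey = build_index(LINKS)
```

Recording the type of relationship matters because it can change: AtW 001 was first treated as a distinct site and only later recognised as part of 2-T-62, and a concordance of this kind can be updated without renumbering either dataset.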
29It is also crucial to reflect on our own collection, storage and management of data. How do we ensure that data collected by the DiverseNile project, or data which is adapted, are accessible and usable for future researchers? A key component of this is keeping track of metadata and paradata, as well as the broad context of our data collection (Hart – Barmby – LeBaur et al. 2016; White – Baldrigde – Brym et al. 2013). Nevertheless, due to the aims of the project any additional information we provide will be inconsistent and the reasons for this will need to be clearly provided in any data sharing.
30It should be highlighted here that the situation in Sudan since April 2023 should also change how we assess our use of this data, not just in terms of the inability to access the archaeological sites that we are studying physically on the ground, but more importantly in terms of being aware of the historical and political context in which our research is taking place. Particularly when studying data from the safety of our offices in Munich, it is crucial to acknowledge both our privileged access to this data and the privilege of being able to do so in a safe environment compared to that of our Sudanese colleagues and the Sudanese people. As of September 2024 the death toll in Sudan has reached over 20,000 people and over 11 million people have been displaced (Al Jazeera 2024; Magdy 2024; UNHCR 2024) in a civil war that continues to be underreported.
31How the situation will affect the rest of the DiverseNile project is still unclear, and a relatively minor concern in comparison to the broader humanitarian context. However, a likely direction is one that relies increasingly on existing archaeological data, something which was already the case due to the COVID-19 pandemic at the beginning of the project. We also need to ensure that we effectively integrate these reflexive considerations into, or alongside, any data that we share at the end of the project. Given the volatile situation in Sudan currently, it is difficult to determine key aspects of data management such as ensuring access for local stakeholders. This is something which will have to be considered closer to the end of the DiverseNile project, recognising that the population of Sudan will have more immediate priorities.
Conclusion
32This article has considered the role of existing archaeological data in ongoing research. The integration of this data is not always straightforward but past data remain an invaluable source of information for current and future archaeological research. When using such data it is important to be aware of their historical context, meta- and paradata, in order to understand their full potential. Re-use is too simplistic a term to apply to many of the ways in which the DiverseNile project is using the data from Vila’s survey, and it is crucial for archaeological work going forward to consider the differences between these types of use, however subtle. These can have a substantial impact on archaeological interpretation. The above focused on published information from the 1970–1975 survey, in part because of its availability online (accessible to all). We hope to also consider some of the original field records from the survey, but as these are kept in Khartoum this is not possible at the present time, and may never be, given recent reports of looting at the Sudanese National Museum (Bashir 2024; Adams 2024; Salih 2024). As the project progresses, reflecting on our use of data from previous research will also allow us to consider the best ways to make any data collected during the DiverseNile project as accessible and useful as possible.
Funding Statement
33This paper was written during the ERC DiverseNile project. The European Research Council (ERC) provided funding for the project under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 865463).
Acknowledgements
34Many thanks to Panos Kratimenos and Julia Budka for commenting on initial drafts of this paper, and to the two anonymous reviewers for their kind and considered comments. I would also like to thank the rest of the DiverseNile team and our friends and colleagues in Sudan without whom this research would not be possible.
Abstracts
Abstract
Chloë Ward
This article considers the use and potential of historical survey data through the case study of a region in northern Sudan being investigated by the DiverseNile project. The focus is on the usability and integration of archaeological legacy data in current archaeological research and methodologies. This is not always straightforward but such data remain an invaluable resource for current and future archaeological research. The article concludes with the different ways that historical survey data is being used, assessed and changed in order to be fully integrated into a current research project, and the implications that this can have.
Keywords
Sudan Archaeology, Nubia Archaeology, Legacy Data, Data Re-use