A dataset of sandstone detrital composition from the Qinghai‐Tibet Plateau

As a focus of Earth science research, the Qinghai‐Tibet Plateau has accumulated a large amount of sedimentary data. We constructed a dataset of sandstone detrital components for the Qinghai‐Tibet Plateau from 63 peer‐reviewed publications. The dataset comprises 1813 Late Proterozoic to Pleistocene sandstones from 84 stratigraphic units. For each sample, we present the reference, detrital composition, GPS coordinates, geographic location, depositional age, tectonic setting and depositional environment. The information on each sandstone sample was standardized and reviewed by sedimentary experts to ensure data quality. The dataset can be used for regional geoscience studies and for exploring the general laws of the source‐to‐sink process. It may also be useful in applied fields, such as finding suitable building stones and supporting oil, gas and mineral exploration.

Asian monsoon, global climate change and the regional distribution of living species (An et al., 2001; Clift, 2006; Deng et al., 2011; Dupont-Nivet et al., 2007; Lai et al., 2022; Molnar et al., 1993). The sandstones of the Qinghai-Tibet Plateau record many important geological processes, such as the India-Asia collision, the Lhasa-Qiangtang collision, the evolution of the Tethys Ocean, the demise of the Bangong-Nujiang Ocean, ophiolite obduction, the erosion of suture zones, and the uplift of the plateau and the Himalayas (An et al., 2017, 2021; Hu et al., 2016; Lai et al., 2017; Lai, Hu, Garzanti, Sun, et al., 2019; Lai, Hu, Garzanti, Xu, et al., 2019; Ma et al., 2020; Najman, 2006). With the increasing number of tectonic sedimentology studies on the Qinghai-Tibet Plateau, a large amount of sedimentology-related data has accumulated, making it possible to compile data on detrital components into a dataset and to investigate general sedimentary laws on the basis of big data.
Here, we present and describe a dataset of sandstone detrital composition from the Qinghai-Tibet Plateau. The geographic extent of the dataset is 16.4-38.7°N and 73-103°E, including Nepal, Pakistan, western China, northeastern India, and parts of Myanmar and Vietnam (Figure 1).

| DATA DESCRIPTION AND DEVELOPMENT
Data were compiled from published scientific papers and theses. To integrate different types of sandstone detrital component data effectively, a unified standard and information entry format is necessary. Considering the limited amount of data for the Qinghai-Tibet region and the convenience of reuse, this dataset is designed as a simple Excel table, including a metadata sheet and an annotated sheet.

| Standardized header of metadata sheet
This dataset comprises four major parts: reference information, geographic location, geological background, and the type and content of detrital fragments. The reference information part includes First_Author, Year, Journal, Vol, Pages, Title and WebLink. The geographic information part includes Country, State/Provenance, Region, Geological Locality, Latitude and Longitude. The geological information part includes Continent, Tectonic_unit, Group, Formation, Member, Sedimentary_Profile, Epoch, Max_Age, Mean_Age, Age_Method, Environment and Sample_ID. The type and content of detrital fragments, the main part of the dataset, includes 32 columns such as Qm, Qp, Pl, Kf, Lv, Lu, Lm, Lsd, Lsc, Lch, L and HM. The descriptions of the detrital fragment codes and the relationships among them are shown in Table 1.
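As an illustration of how the detrital fragment columns might be used, the following minimal Python sketch recalculates one sample to Q-F-L percentages. It assumes the commonly used grouping Q = Qm + Qp, F = Pl + Kf and L = total lithic fragments; this grouping, the function name `qfl_percent` and the example counts are assumptions made for illustration, and the authoritative relationships among the codes are those given in Table 1.

```python
def qfl_percent(qm, qp, pl, kf, l):
    """Return (Q, F, L) recalculated to 100%.

    Assumed grouping (not taken from the dataset itself):
    Q = Qm + Qp, F = Pl + Kf, L = total lithic fragments.
    """
    q = qm + qp
    f = pl + kf
    total = q + f + l
    if total <= 0:
        raise ValueError("Q + F + L must be positive")
    # Normalize each end member to a percentage of the Q + F + L sum
    return tuple(round(100.0 * x / total, 1) for x in (q, f, l))


# Hypothetical point counts for a single sandstone sample
print(qfl_percent(qm=120, qp=30, pl=40, kf=10, l=50))  # -> (60.0, 20.0, 20.0)
```

The same function works whether the columns hold raw point counts or percentages, because only the ratios among Q, F and L matter.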

| Dataset construction process
The process of building the dataset of detrital composition is as follows:
1. A first round of searching used "Tibet or Himalaya" as the keyword to find related geoscience papers, followed by further screening with the keyword "point counting or Gazzi-Dickinson".
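The two-stage keyword screening described above can be sketched as follows. This is a minimal illustration only, not the search procedure actually used: the record structure, the `screen` function and the example titles are hypothetical, and the regular expressions are one possible reading of the two keyword sets.

```python
import re

# Stage 1: regional keywords; Stage 2: method keywords
STAGE1 = re.compile(r"\b(tibet|himalaya)", re.IGNORECASE)
STAGE2 = re.compile(r"point[\s-]counting|gazzi[\s-]dickinson", re.IGNORECASE)


def screen(records):
    """Keep records whose text passes both keyword stages in turn."""
    first_pass = [r for r in records if STAGE1.search(r["text"])]
    return [r for r in first_pass if STAGE2.search(r["text"])]


# Hypothetical bibliographic records
papers = [
    {"id": "A", "text": "Sandstone provenance in southern Tibet from Gazzi-Dickinson point counting"},
    {"id": "B", "text": "Zircon ages from the Himalaya"},
    {"id": "C", "text": "Point counting of Alpine sandstones"},
]
print([p["id"] for p in screen(papers)])  # -> ['A']
```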

| Supplementation and calibration of sedimentary environment and spatiotemporal information
To provide complete temporal and environmental attributes for each rock sample, GPS coordinates, epoch and depositional environment were supplemented and calibrated as far as possible. GPS information is a digital record of accurate spatial location and a necessary element for future mapping. Although some GPS data are provided in the main text or in the attached tables of the source papers, >40% of rock samples lack accurate GPS coordinates. For these samples, we read the corresponding GPS values from the geological maps or the lithological columns in the paper. If the GPS values were still not available, we assigned the GPS of the region or locality described in the text.
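The order of preference for GPS sources described above (reported value, then value read from a map or column, then regional locality) can be expressed as a simple fallback. The sketch below is illustrative only: the function `resolve_gps` and the source labels are hypothetical and are not columns of the dataset.

```python
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def resolve_gps(reported: Optional[Coord],
                from_map: Optional[Coord],
                locality: Optional[Coord]) -> Tuple[Optional[Coord], str]:
    """Return the best available coordinate and a label for its source."""
    candidates = (
        (reported, "reported"),        # given in the text or attached table
        (from_map, "map_or_column"),   # read from a geological map or column
        (locality, "locality"),        # coordinates of the region or locality
    )
    for coord, source in candidates:
        if coord is not None:
            return coord, source
    return None, "missing"


# Hypothetical sample without reported GPS but with a map reading
print(resolve_gps(None, (29.3, 88.9), (29.0, 89.0)))
```

Recording the source label alongside the coordinate would let users filter out samples whose location is only approximate.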
TABLE 1 Fragment codes for framework composition of sandstones (modified after Ingersoll et al., 1984)
The sedimentary environment and depositional age in the geological information section can generally be found in the main text of the source papers, but >30% of the samples still lacked depositional age or sedimentary environment information. We consulted reviews of stratigraphic or sedimentary studies covering the relevant study area and stratigraphic unit (Hu et al., 2022; Lai et al., 2020; Xue et al., 2023) and supplemented the missing information from the descriptions in this literature. Despite these efforts, 41 sandstone samples still lack sedimentary environment information. These samples with missing depositional environments and ages await the publication of subsequent research and future updates to the dataset.
The tectonic unit information follows the unified tectonic unit division standard of the Qinghai-Tibet Plateau (Pan et al., 2009; Zhu et al., 2013). The supplementation and calibration of the tectonic unit were based on the GPS projection of the samples.
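Assigning a tectonic unit from the GPS projection of a sample amounts to a point-in-polygon test. The sketch below uses a standard ray-casting test on planar longitude and latitude values, which is a simplification; the unit names and rectangular outlines are hypothetical placeholders and do not represent the real boundaries of Pan et al. (2009) or Zhu et al. (2013).

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_unit(lon, lat, units):
    """Return the name of the first unit whose outline contains the point."""
    for name, polygon in units.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Hypothetical rectangular outlines, NOT real tectonic boundaries
units = {
    "Unit_A": [(80.0, 28.0), (95.0, 28.0), (95.0, 30.0), (80.0, 30.0)],
    "Unit_B": [(80.0, 30.0), (95.0, 30.0), (95.0, 33.0), (80.0, 33.0)],
}
print(assign_unit(88.0, 29.0, units))  # -> Unit_A
print(assign_unit(88.0, 31.5, units))  # -> Unit_B
print(assign_unit(70.0, 29.0, units))  # -> None
```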
The geological information in the dataset was reviewed and standardized with respect to spatial and temporal, environmental and tectonic attributes, and the descriptions were checked. This work was carried out by the coauthors and by sedimentologists who are familiar with the geology of the area or have done relevant research, based on original geological evidence such as lithological columns, field photographs and characterization descriptions provided in the original or other associated papers.

| Dataset location and format
The dataset is available at DDE (https://repository.deeptime.org/). It consists of a metadata Excel table and a .kml file. Each row of the main table records all the information for one sandstone sample.
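To show how a table row relates to an entry in a .kml file, the following minimal sketch builds a KML `Placemark` element from a sample identifier and its coordinates using Python's standard library. The function `row_to_placemark` and the example values are hypothetical, and the structure of the .kml file actually distributed with the dataset may differ.

```python
import xml.etree.ElementTree as ET


def row_to_placemark(sample_id, latitude, longitude):
    """Build a KML Placemark element for one sandstone sample."""
    placemark = ET.Element("Placemark")
    ET.SubElement(placemark, "name").text = str(sample_id)
    point = ET.SubElement(placemark, "Point")
    # KML lists coordinates as longitude,latitude (not latitude,longitude)
    ET.SubElement(point, "coordinates").text = f"{longitude},{latitude}"
    return placemark


# Hypothetical sample taken from the Sample_ID, Latitude and Longitude columns
pm = row_to_placemark("S-001", 29.65, 91.1)
print(ET.tostring(pm, encoding="unicode"))
```

The longitude-first coordinate order is the usual pitfall when converting a latitude and longitude table to KML.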

| Update of dataset
Under the unified management of DDE, this dataset will be updated every 2-3 years to add newly published or newly discovered literature data for the Qinghai-Tibet region. For the stratigraphic units in the existing dataset, if more accurate results such as age and depositional environment are published in the future, the corresponding information will also be updated or corrected.

| POTENTIAL DATASET USE AND REUSE
There are multiple data interfaces in our dataset, through which it can be correlated with different fields of geoscience research or with other datasets. This compatibility will be conducive to wide use of the dataset. For example, the geographic locations or GPS coordinates can be linked to tectonic studies or social applications, and information such as stratigraphic units or epochs can be linked to other geological studies.
The dataset covers diverse depositional environments, provenance features and tectonic backgrounds. The tectonic backgrounds include intracontinental settings, continental block margins, subduction zones, orogenic belts, suture zones and most other tectonic settings. These representative data provide an important reference for research on key geoscience issues such as the general laws of source-sink processes and the relationship between sedimentary and tectonic settings. In addition, these detrital data may still hold regional geoscience patterns and disciplinary value that remain to be explored.
In addition to geological research, this dataset can also be used in engineering applications that serve society and people's livelihoods. For example, the detrital component or geological setting data can assist in finding suitable stone for the aggregates required by infrastructure, and can support petroleum and mineral exploration and other production practices.

| CONCLUSIONS
This dataset collects information such as document source, geographic location, space-time, environment and material composition of 1813 sandstones since the Proterozoic. The rock samples of the Qinghai-Xizang Detrital Component Dataset formed in different tectonic backgrounds and diverse sedimentary environments, and together they form a representative regional dataset for the study of source-sink systems. The dataset has prospects for future basic research, resource exploration and infrastructure construction, and is of value to both science and society.

ACKNOWLEDGEMENTS
We are grateful to Dr. Jiangang Wang, Dr. Wei An, Dr. Gaoyuan Sun, Dr. Juan Li and Dr. Weiwei Xue for their helpful discussion and comments. We are grateful for the editorial guidance from the Editor-in-chief, Dr. Jian Peng, and for constructive comments from the Associate editor, Dr. Jim Ogg, and an anonymous reviewer. We extend our thanks to the Deep-time Digital Earth international big science program (DDE) for data repository and data services. The research was funded by the National Natural Science Foundation of China Project (42050102, 4200020124) and the Jiangsu Shuangchuang (Mass Innovation and Entrepreneurship) Talent Program (JSSCBS20210014).

CONFLICT OF INTEREST
This article has earned an Open Data badge for making publicly available the digitally shareable data necessary to reproduce the reported results. The data are available at https://doi.org/10.12297/1532334650827177985. Learn more about the Open Practices badges from the Center for Open Science: https://osf.io/tvyxz/wiki.

OPEN RESEARCH STATEMENT
This article has been awarded Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. Data is available at Open Science Framework

ORCID
Xiumian Hu https://orcid.org/0000-0002-5401-8682

FIGURE 2 Stratigraphic chart of sampled formations in regions within the Qinghai-Tibet Plateau (only the stratigraphic units sampled for Gazzi-Dickinson point-counting analysis are shown)
FIGURE 3 Number of sandstones in each period in transition interval between periods. Q, Quaternary; N, Neogene; Pg, Paleogene; K, Cretaceous; J, Jurassic; T, Triassic; P, Permian; D, Devonian; S, Silurian; Є, Cambrian; Pt3, Neoproterozoic
FIGURE 4 Number of sandstones for each sedimentary environment included in the dataset

The 1813 sandstones in the dataset were deposited from Late Proterozoic to Pleistocene, the epoch with the most sandstone samples followed by Cretaceous, Jurassic, Triassic, Paleogene and Neogene (Figure