Constraining sediment provenance for tsunami deposits using distributions of grain size and foraminifera from the Kujukuri coastline and shelf, Japan

Tsunami deposits preserved in the geological record provide a more comprehensive understanding of tsunami frequency and intensity over longer timescales, but recognizing them can prove challenging because of post-depositional changes, a lack of contrast between the deposits and surrounding sedimentary layers, and the difficulty of differentiating between tsunami and storm deposition. Modern baseline studies address these challenges by providing insight into modern spatial distributions that can be compared with palaeotsunami deposits. This study documents the spatial fingerprint of grain size and foraminifera from Hasunuma Beach and the Kujukuri shelf to provide a basis from which tsunami deposits can be interpreted. At Hasunuma Beach, approximately 50 km east of Tokyo, the spatial distributions of three common proxies for tsunami identification (foraminiferal taxonomy, foraminiferal taphonomy and sediment grain size) were mapped and clustered using Partitioning Around Medoids cluster analysis, which objectively discriminated two coastal zones corresponding to onshore and offshore sample locations. Results show that onshore samples are characterized by coarser grain sizes (medium to coarse sand) and higher abundances of Pararotalia nipponica (27 to 63%) than offshore samples, which are characterized by finer grain sizes (fine to medium sand), lower abundances of Pararotalia nipponica (2 to 19%) and Ammonia parkinsoniana (0 to 10%), higher abundances of planktonics (15 to 58%) and species with fragile tests including Uvigerinella glabra. When compared to grain size and foraminiferal taxonomy, foraminiferal taphonomy (i.e. the surface condition of foraminifera), a proxy not commonly used to identify tsunami deposits, was most effective in discriminating modern coastal zones (it identified supratidal, intertidal and offshore environments) and determining sediment provenance for tsunami deposits at Kujukuri. This modern baseline study assists the interpretation of tsunami deposits in the geological record because it provides a basis for sediment provenance to be determined.


INTRODUCTION
The coastlines of eastern Japan have a long history of repeated large earthquakes and tsunamis (e.g. Nanayama et al., 2003; Sawai et al., 2012). Geological studies conducted in Hokkaido (Nanayama et al., 2003; Sawai et al., 2004, 2009b), Tohoku (Sawai et al., 2012; Tanigawa et al., 2014) and Kanto (Fujiwara & Kamataki, 2007; Shishikura, 2014) have revealed evidence of earthquakes and tsunamis that pre-date the historical record, extending the timeframe of known events back by up to 4000 years. These studies are used to assess the long-term seismic trends along subduction zones (e.g. Sawai et al., 2012, 2015). Assessing the provenance of overwash deposits, such as those from Japan, can provide improved hazard assessment by determining the transport distance and depth from which sediments were entrained (e.g. Uchida et al., 2010; Kosciuch et al., 2018). However, assessing the provenance of anomalous sand layers preserved within coastal sediments is complicated by the presence of a mixture of marine, brackish and terrestrial sediments, an artifact of erosion, transport and deposition by tsunami waves as they inundate the coastline (e.g. Dawson et al., 1996; Grand Pre et al., 2012; Pilarczyk et al., 2014). Detailed modern distribution studies of sedimentological proxies help in this regard because they identify specific sediment sources within coastal, nearshore and offshore environments (e.g. Gischler & Möder, 2009; Pilarczyk et al., 2011; Kosciuch et al., 2018).
The spatial distributions of three proxies for palaeotsunami identification were examined and mapped at Hasunuma Beach in central Kujukuri (Fig. 1): foraminiferal taxonomy, foraminiferal taphonomy (i.e. surface character of individual tests) and sediment grain size. A multiproxy approach is necessary to properly assess overwash deposits, but determining which proxies are most useful for assessing sediment provenance is of particular importance here: although this region was inundated by the 2011 Tohoku tsunami (Goto et al., 2012), little is known about the long-term patterns of tsunami recurrence and magnitude. The establishment of a modern baseline study for palaeotsunami recognition and sediment provenance assessment will aid in the interpretation of older tsunami deposits preserved in coastal sediments in the region around Metropolitan Tokyo (Fujiwara & Kamataki, 2007).

SITE DESCRIPTION
Hasunuma Beach is part of the Kujukuri strand plain system located on the Pacific side of Japan ca 50 km east of Tokyo (Fig. 1A and B). The close proximity of the Kujukuri strand plain to the convergent boundaries between the Continental, Pacific Ocean and Philippine Sea plates has resulted in net tectonic uplift over the Holocene (e.g. Shishikura, 2000, 2001; Shishikura & Miyauchi, 2001; Tamura et al., 2010). This net uplift, in combination with a continuous supply of sediment from eroding headlands, has permitted the formation of a prograding strand plain system that is 7 to 10 km wide and extends ca 50 km in a north-east/south-west direction (Shishikura, 2000). A series of parallel and subparallel sandy beach ridges (ranging in height from 2 to 10 m above TP; TP = Tokyo Peil, the mean sea-level at Tokyo Bay) and swales document a history of prograding shorelines over the last 6000 years (Moriwaki, 1979; Masuda et al., 2001; Tamura et al., 2007) to a maximum of 10 km in a seaward direction (Sunamura & Horikawa, 1977).
The landward limit of the Kujukuri strand plain is bounded by marine terraces composed of Pleistocene marine deposits and loam (Marine Isotope Stage 5e) that reach up to 130 m above TP in elevation (Tamura et al., 2010). The seaward limit is characterized by a series of continuous sandy beaches (ca 100 to 200 m in width) along a microtidal (mean high tide level of 1.07 m above TP) coastline. The shelf offshore Kujukuri is characterized by a thin layer of Holocene sand and outcrop (Nishida et al., 2019), leaving eroding headlands (e.g. the Byobugaura and Taitosaki coastal cliffs) as the main sediment input to the coastal zone, with river discharge contributing a minor amount (Uda, 1989).
The coastal environment is divided into four broad zones according to elevation, distance from the shoreline and geomorphic features: (i) a shallow-sloping, sediment-starved shelf (Kujukuri shelf) reaching depths up to 200 m; (ii) a high-energy, wave-dominated swash zone; (iii) a gently sloping beach face consisting of both foreshore and backshore zones (Tamura et al., 2008); and (iv) a sparsely vegetated dune that transitions into a stable dune system characterized by shrubs, trees and tall grasses (Fig. 1C).
The position of the Kujukuri beaches relative to the open Pacific Ocean, as well as the Japan Trench and Sagami Trough, makes the coastline vulnerable to inundation by tsunamis (for example, Goto et al., 2012).

Sample collection and elevations
A total of 50 surface sediment samples (15 cm³ from the upper 1 cm) were collected from two shore-perpendicular coastal transects (T1 and T2) near Hasunuma Beach, located in central Kujukuri (Figs 1B, 2A and 3A), for grain-size and foraminiferal (taxonomy and taphonomy) analyses. Transects were positioned in areas that represent the full coastal gradient spanning the dune (D), backshore (BS), foreshore (FS), swash (S) and offshore (O) sub-environments (Fig. 1B and C), and sediments from each of these geomorphic zones were sampled along both transects. Transects T1 and T2 consist of 28 (T1-1 to T1-28; Fig. 2) and 22 samples (T2-1 to T2-22; Fig. 3), respectively (Tables S1 to S4). A topographic survey was conducted with a Virtual Reference Station (VRS) using the Real Time Kinematic Global Positioning System (RTK-GPS) GS10 (Leica Geosystems Company Limited, Tokyo, Japan) with elevations tied to Tokyo Peil (m above TP). The survey was conducted at each sample location within the dune, backshore, foreshore and swash sub-environments, whereas a Crescent A100, VS110 GPS Compass (Hemisphere GPS Inc., Calgary, Canada), and Echo sounders PDR-1300 and PDR-130 (Senbondenki Company Limited, Tokyo, Japan) and JFE 380 (Japan Radio Company Limited, Tokyo, Japan) were used to measure the position and depth of all offshore samples.
The offshore samples span the entire Kujukuri shelf and were collected in 2014 and 2015 using a Smith-McIntyre grab sampler (upper 0 to 5 cm of surface sediment) deployed from a boat. Dune, backshore, foreshore and swash samples were collected in 2013 by means of a walking survey. Coastal transects were used to target the likely sources of overwash sediment for the purpose of establishing a detailed modern distribution that can be used for comparison with the palaeorecord by future studies. Sediment samples (15 cm³ from the upper 1 cm) from each of the sub-environments were described in terms of grain-size and foraminiferal (taxonomy and taphonomy) content. Grain-size and foraminiferal data were then clustered using Partitioning Around Medoids (PAM) cluster analysis to discriminate biofacies, taphofacies, lithofacies and combination-facies.

Grain-size analysis
Grain-size analysis was conducted on 50 surface samples using a Camsizer (Retsch Technology GmbH, Haan, Germany; measuring grain sizes between 1.6 μm and 3200 μm). Prior to analysis on the Camsizer, organics and carbonates were removed using 30% hydrogen peroxide and 10% hydrochloric acid, and samples were dried at 50°C for 72 h (Matsumoto et al., 2016). The average of three replicates was determined before converting grain-size values to the Wentworth-Phi Scale (Krumbein, 1934). Grain-size values and abundances for each transect were interpolated and gridded using a Triangular Irregular Network (TIN) algorithm according to Sambridge et al. (1995) and plotted as Particle-Size Distribution (PSD) plots in Geosoft Oasis™ (Figs 2B and 3B; Donato et al., 2009). Grain-size descriptions follow those of Blott & Pye (2001). Statistical parameters including mean (average grain size), mode (dominant grain size), standard deviation (degree of sorting), skewness (degree of symmetry of the grain-size distribution), kurtosis (peakedness of a grain-size distribution), and d10, d50 and d90 (grain size at which 10%, 50% or 90% of a sample's volume is occupied by smaller grains) were used to characterize each sample and are listed in Tables S1 and S2.
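The phi conversion mentioned above is a one-line formula, φ = −log2(d / 1 mm), where d is the grain diameter. The Python sketch below is an illustration only and not the authors' workflow: the function names are hypothetical, the class limits are the standard Wentworth sand boundaries, and assigning a boundary value to the finer class is an assumption made here.

```python
import math

def microns_to_phi(diameter_um):
    """Convert a grain diameter in micrometres to Krumbein phi units."""
    diameter_mm = diameter_um / 1000.0
    return -math.log2(diameter_mm)

def wentworth_sand_class(phi):
    """Name the Wentworth sand class for a phi value (sand range only).

    Assumption: a value exactly on a boundary goes to the finer class.
    """
    classes = [
        (-1.0, 0.0, "very coarse sand"),
        (0.0, 1.0, "coarse sand"),
        (1.0, 2.0, "medium sand"),
        (2.0, 3.0, "fine sand"),
        (3.0, 4.0, "very fine sand"),
    ]
    for lower, upper, name in classes:
        if lower <= phi < upper:
            return name
    return "outside sand range"
```

Under these assumptions a 250 μm grain converts to 2.0 φ, and a mean of 1.68 φ falls in the medium sand class. Larger phi values always mean finer sediment, which is why a seaward increase in phi reads as seaward fining.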

Foraminiferal analysis
Thirty-three surface sediment samples for foraminiferal analysis were stained with Rose Bengal and stored in buffered ethanol in the field immediately following collection to identify living versus dead individuals (Figs 2C and 3C; Walton, 1952; Murray & Bowser, 2000). An additional 17 unstained surface sediment samples were used where only the total assemblage was considered. Prior to analysis, 10 cm³ sediment samples were washed over a 63 μm sieve and the sample was wet split to obtain counts of ca 300 specimens (Scott & Hermelin, 1993). The total number of foraminifera (living and dead) contained within 10 cm³ was calculated for each sample site (Figs 2C and 3C; Tables S3 and S4). Foraminiferal taxonomy followed Loeblich & Tappan (1987) and Uchida et al. (2010), and identifications were confirmed using the type specimens at the Smithsonian Institution in Washington DC. Figure 4 shows scanning electron microscope (SEM) images obtained using a Hitachi TM3030 tabletop microscope (Hitachi High-Tech, Fukuoka, Japan) of the dominant foraminiferal taxa identified in this study. Foraminifera were categorized using the taphonomic criteria defined by Pilarczyk et al. (2011), which include unaltered, fragmented and corraded (combined influence of corrosion and abrasion) individuals. Broken specimens with angular edges were classified as fragmented, whereas broken specimens with rounded edges were classified as corraded. The taphonomic (or surface) condition of individual foraminifera has previously been used to interpret overwash deposits because it is an indicator of the origin and transport history of the sediment (e.g. Goff et al., 2011; Pilarczyk et al., 2012; Hoffmann et al., 2018). Elevation or depth (m above TP), total foraminifera in 10 cm³, abundance (%) of dead (unstained) individuals (where available), dominant species and taphonomic character were plotted against increasing distance from the shoreline for both transects (Figs 2 and 3). Foraminiferal results are listed in Tables S3 and S4.

Cluster analysis
Partitioning Around Medoids cluster analysis was used to objectively determine biofacies, taphofacies, lithofacies and combination-facies that were then compared with observed geomorphic beach zones along transects T1 and T2. Prior to cluster analysis, taxonomic, taphonomic and grain-size values were standardized: first, raw counts (taxonomic and taphonomic data) and measurements (grain-size data) were converted to abundances (%), and then Z-scores were calculated and used for clustering. Z-scores standardize a dataset by expressing each value as the number of standard deviations it lies from the mean (Davis & Sampson, 2002). The PAM cluster analysis (Kaufman & Rousseeuw, 1990) was then performed following the methods of Kemp et al. (2012), where the 'cluster' package in R (Maechler et al., 2005) was used to identify zonation within the Kujukuri coastal transect based on combinations of foraminiferal taxonomy (total assemblage), taphonomy and grain-size datasets.
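The two standardization steps described above (counts to percentages, then percentages to Z-scores) can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' R code: the function names and the example counts are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not say which form was used.

```python
import statistics

def to_percent(counts):
    """Convert raw counts for one sample into relative abundances (%)."""
    total = sum(counts)
    return [100.0 * count / total for count in counts]

def z_scores(values):
    """Standardize one variable across samples: (value - mean) / SD.

    Assumption: sample standard deviation (n - 1 denominator).
    Requires at least two values that are not all identical.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(value - mean) / sd for value in values]

# Hypothetical counts of unaltered, fragmented and corraded tests in one sample.
percentages = to_percent([30, 60, 210])

# Hypothetical % corraded values for three samples, standardized across samples.
standardized = z_scores([10.0, 20.0, 30.0])
```

Standardizing each variable this way puts percentages and phi-based measurements on a common scale, so that no single variable dominates the distances used for clustering.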
Partitioning Around Medoids (PAM) generates silhouette plots with widths ranging from −1 to 1. Silhouettes are an estimate of a sample's classification, where values close to −1 indicate samples that are incorrectly classified, whereas values close to 1 indicate that the sample was assigned to the appropriate cluster. The maximum average silhouette width was used to objectively determine the number of clusters within each dataset combination of this study (for example, tests 1 to 7; see below). Methods for determining silhouette width are detailed in Kaufman & Rousseeuw (1990).
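For readers unfamiliar with silhouettes, the width for a sample is s = (b − a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its mean distance to the nearest other cluster. The Python sketch below computes this for already-assigned labels; it does not perform the PAM clustering itself, it assumes Euclidean distance (the source does not state the metric), and the function names and toy points are hypothetical.

```python
import math

def silhouette_widths(points, labels):
    """Return the silhouette width of every point (needs >= 2 clusters)."""
    widths = []
    for i, (p, own) in enumerate(zip(points, labels)):
        sums, counts = {}, {}
        # Accumulate distances from point i to every other point, per cluster.
        for j, (q, label) in enumerate(zip(points, labels)):
            if i == j:
                continue
            sums[label] = sums.get(label, 0.0) + math.dist(p, q)
            counts[label] = counts.get(label, 0) + 1
        if own not in counts:
            widths.append(0.0)  # single-sample cluster: width is 0 by convention
            continue
        a = sums[own] / counts[own]
        b = min(sums[k] / counts[k] for k in sums if k != own)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths

def average_silhouette(points, labels):
    """Mean silhouette width, the quantity maximized when choosing k."""
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)

# Toy one-variable data: two tight, well-separated groups.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_widths(toy_points, [0, 0, 1, 1])
bad = silhouette_widths(toy_points, [0, 1, 0, 1])
```

With the sensible labelling every width is close to 1, whereas the deliberately scrambled labelling yields negative widths. The zero returned for a single-sample cluster matches the convention of Kaufman & Rousseeuw (1990).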
In order to determine which datasets clustered most effectively, seven combinations of data were clustered using % abundances: test 1 (taxonomy); test 2 (taphonomy); test 3 (grain size); test 4 (taxonomy and taphonomy); test 5 (taxonomy, taphonomy and grain size); test 6 (taxonomy and grain size); and test 7 (taphonomy and grain size). These tests were performed on T1 (Figs S1 to S3) and T2 (Figs S4 to S6) separately to determine the spatial variability at Hasunuma Beach, and then the datasets were combined (T1 and T2; Figs 5 and S7 to S10) to determine regional variability.
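The seven tests are simply every non-empty combination of the three datasets (2³ − 1 = 7). A minimal Python sketch of that bookkeeping follows; the function names and the tiny feature blocks are hypothetical, and the enumeration order produced here is not the same as the test numbering used in this study.

```python
from itertools import combinations

DATASETS = ("taxonomy", "taphonomy", "grain size")

def dataset_combinations(names):
    """Return every non-empty combination of dataset names."""
    return [combo
            for size in range(1, len(names) + 1)
            for combo in combinations(names, size)]

def combine_features(blocks, combo):
    """Concatenate the per-sample feature lists of the chosen datasets."""
    n_samples = len(blocks[combo[0]])
    return [[value for name in combo for value in blocks[name][i]]
            for i in range(n_samples)]

# Hypothetical standardized features for two samples.
example_blocks = {
    "taxonomy": [[1.0, 2.0], [3.0, 4.0]],
    "taphonomy": [[5.0], [6.0]],
    "grain size": [[7.0], [8.0]],
}
all_tests = dataset_combinations(DATASETS)
taph_and_grain = combine_features(example_blocks, ("taphonomy", "grain size"))
```

Each combined feature table would then be clustered separately, which is what allows the contribution of each proxy to be compared.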

Grain size
Transect 1 (T1)
Average (mean) grain size. Transect T1 spanned a distance of 47 km from the coastal dunes of Hasunuma Beach (3.4 m above TP) to the seaward edge of the Kujukuri shelf (−123 m above TP). The mean grain size along this transect ranged from fine to coarse sand, with all but two samples (T1-15 and T1-28) in the fine to medium sand range (Fig. 2B; Table S1). In general, the mean grain size became finer with depth, with sediments from the dune (average mean = 1.98 Φ), backshore (average mean = 1.94 Φ) and foreshore (average mean = 1.92 Φ) being slightly coarser than those from the offshore (average mean = 2.50 Φ). The one exception to this seaward fining trend was the swash zone, where the average mean grain size was 1.45 Φ. Within the offshore zone, sediments became progressively finer with increased distance from the shoreline up to a distance of 23 km (for example, 2.70 Φ at T1-16 versus 2.97 Φ at T1-22), at which point the sediments became progressively coarser (for example, 2.81 Φ at T1-23 versus 0.155 Φ at T1-28).
Dominant grain size (mode). The average mode, or dominant grain size within a given distribution, became finer with increasing distance inland from the swash (for example, average mode in dune samples = 2.05 Φ versus 1.76 Φ in the swash). In general, the finest mode values were found in the offshore samples, which showed progressive fining up to a distance of 23 km from the shoreline (for example, 2.62 Φ at T1-16 versus 3.12 Φ at T1-22), at which point the mode became progressively coarser (2.88 Φ at T1-23 versus 1.12 Φ at T1-28).

Degree of sorting. Aside from the two coarsest samples, T1-15 and T1-28, which were poorly sorted (1.08 Φ and 1.25 Φ, respectively), sediment along T1 was moderately to very well-sorted (0.81 Φ to 0.32 Φ; Table S1). In general, the degree of sorting increased in a seaward direction from the dune (moderately well-sorted; average SD = 0.60 Φ) to the offshore (well-sorted; average SD = 0.45 Φ). However, with an average degree of sorting of 0.80 Φ (moderately sorted), the swash was the least sorted zone.
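The sorting terms used above map onto ranges of standard deviation in phi units. The Python sketch below encodes the widely used Folk & Ward class limits as an assumption; the text cites Blott & Pye (2001) for grain-size descriptions but does not list the limits it applied, and the function name is hypothetical.

```python
def sorting_class(sd_phi):
    """Return a descriptive sorting class for a standard deviation in phi.

    Assumption: Folk & Ward class limits, with a boundary value
    assigned to the more poorly sorted class.
    """
    upper_limits = [
        (0.35, "very well sorted"),
        (0.50, "well sorted"),
        (0.70, "moderately well sorted"),
        (1.00, "moderately sorted"),
        (2.00, "poorly sorted"),
        (4.00, "very poorly sorted"),
    ]
    for upper, name in upper_limits:
        if sd_phi < upper:
            return name
    return "extremely poorly sorted"
```

Under these limits the values quoted for T1 classify as described in the text: 0.60 Φ is moderately well sorted, 0.45 Φ is well sorted, 0.80 Φ is moderately sorted, and 1.08 Φ and 1.25 Φ are poorly sorted.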
Skewness and kurtosis. In general, most samples from T1 had a symmetrical grain-size distribution and mesokurtic kurtosis (sediment profile curve with a normal distribution). Coarsely skewed (sediment profile curve where the coarse grains dominate) samples were generally limited to the offshore zone, with T1-11 in the foreshore and T1-4 in the backshore also exhibiting coarse skewness (−0.11 Φ and −0.12 Φ, respectively). The only finely skewed (sediment profile curve where the fine grains dominate) sample was T1-15 collected from the swash zone (0.10 Φ). Platykurtic (a sediment profile curve that is wider around the mean and has a thin tail) samples were exclusive to the foreshore and swash zones, where sediments were generally the coarsest (mean grain size = 1.80 to 1.57 Φ) and the least sorted (mean SD = 0.81 to 0.65 Φ). The one exception to this pattern was T1-28, a poorly sorted coarse sand with a platykurtic kurtosis of 0.81 Φ. By contrast, samples with leptokurtic kurtosis (sediment profile curve that is more clustered around the mean and has fatter tails) were only found in the offshore zone, with seven of the 13 samples collected from the offshore containing this type of kurtosis.
Transect 2 (T2)
Average (mean) grain size. Transect T2 spanned a distance of 41 km from the coastal dunes of Hasunuma Beach (3.6 m above TP) to the seaward edge of the Kujukuri shelf (−124 m above TP). The mean grain size along this transect ranged from fine to coarse sand, with all but one sample (T2-21) in the fine to medium sand range (Fig. 2B; Table S2). Similar to T1, the mean grain size along T2 became finer with depth. Sediments from the dune (average mean = 1.96 Φ), backshore (average mean = 1.76 Φ) and foreshore (average mean = 1.89 Φ) were slightly coarser than those from the offshore (average mean = 2.32 Φ). The coarsest samples were from the swash zone, where the average mean grain size was 1.34 Φ. Within the offshore zone, sediments were generally within the range of fine sand (2.80 to 1.95 Φ) until a distance of 40 km, at which point the sediments increased in grain size (for example, 2.63 Φ at T2-20 versus 0.87 Φ at T2-21).
Dominant grain size (mode). The dominant grain size along T2 did not show as strong a seaward fining trend as T1. Although the coarsest modes were found in the swash zone (average mode = 1.15 Φ), the foreshore samples were generally finer (average mode = 2.05 Φ) than the dune samples (average mode = 1.96 Φ). Similar to T1, the finest mode values were found in the offshore samples up to a distance of 34 km (2.87 to 2.37 Φ), at which point the mode became increasingly coarse (2.37 Φ at T2-20 versus 1.62 Φ at T2-22).

Foraminifera
Transect 1 (T1)
Foraminifera are present in all samples retrieved from T1 (Fig. 2; Table S3). Moving from the dune to the offshore, the total concentration of foraminifera ranged from 10 to 9952 individuals per 10 cm³. Concentrations of foraminifera peaked in the offshore zone (320 to 9952 individuals per 10 cm³) and markedly decreased in the swash zone (382 to 956 individuals per 10 cm³), foreshore (68 to 240 individuals per 10 cm³), backshore (74 to 86 individuals per 10 cm³) and dune (10 to 42 individuals per 10 cm³).
The taphonomic condition of foraminifera contained within sediments from T1 includes fragmented, corraded and unaltered forms. Samples within the swash zone are exclusively dominated by unaltered individuals (44 to 52%), whereas samples within the foreshore, backshore and dune are generally dominated by corraded individuals (27 to 51%, 58 to 82% and 81 to 93%, respectively). By contrast, offshore samples are dominated by unaltered individuals (65 to 88%).

Cluster analysis: biofacies, taphofacies and lithofacies
Datasets from T1 (Figs S1 to S3) and T2 (Figs S4 to S6) were first considered separately to examine spatial variability within the proxy data at Hasunuma Beach. The datasets (T1 + T2) were then combined to test whether coastal zones could be discriminated (Figs S7 to S10). In the case of T1 versus T2, tests that produced the highest average silhouette width were ordered in the same way (Table 1), with test 2 (taphonomy; T1 = 0.56, T2 = 0.48; Figs S1bi and S4bi, respectively) and test 3 (grain size; T1 = 0.41, T2 = 0.40; Figs S1ci and S4ci, respectively) producing the greatest widths, and test 1 (taxonomy; T1 = 0.16, T2 = 0.12; Figs S1ai and S4ai, respectively) producing the smallest widths. All seven tests for each of T1 and T2 indicate that the data can be reliably divided into two to three clusters. For T2 test 1, the five-cluster scenario produced the highest average silhouette width (0.14), but one of the clusters returned a value of 0, indicating that it was composed of a single sample (Fig. S4av). This is similar to T2 test 3, where the four-cluster scenario also returned a 0 value for one of its clusters (Fig. S4civ).
In six of the seven tests, the highest average silhouette width was produced by the three-cluster scenario (all tests except test 4). However, only in the case of test 2 (taphonomy) did the clusters make ecological sense (average silhouette width = 0.46; 94% of samples clustered in the appropriate group). Taphonomy strengthened taxonomic data. For example, under the two-cluster scenario, taxonomy had an average silhouette width of 0.13 and 90% of samples clustered in the appropriate group, while taxonomy and taphonomy together had an average silhouette width of 0.18 and 98% of samples clustered appropriately. Adding taphonomy did not improve the grain-size clustering; the accuracy of clustered samples remained the same (90%) and the average silhouette width only varied by 0.01 (Table 1). Adding all three parameters together did not improve the accuracy of appropriately clustered samples or the average silhouette width by much, except in the case of taxonomy.

Biofacies (test 1)
Group BF-1 (onshore group) has an average silhouette width of 0.16 and is dominated by Pararotalia nipponica (9 to 63%). The elevation of this cluster ranges from −34 to 4 m above TP, and its distance from the shoreline ranges from −18 460 to 120 m. Group BF-2 (offshore group) has an average silhouette width of 0.08. The elevation of samples in BF-2 ranges from −124 to 1.0 m above TP, and their distance from the shoreline ranges from −46 600 to 49 m. The foraminiferal assemblage of BF-2 is characterized by low abundances of fragile-tested species including Uvigerinella glabra, Buccella frigida and A. tepida.

Taphofacies (test 2)
Group TF-1 (intertidal group) is characterized by an average silhouette width of 0.61, an elevation of −0.5 to 1.0 m above TP and a distance from the shoreline of 0 to 49 m. The taphonomic assemblage of TF-1 is dominated by unaltered foraminifera (12 to 52%), followed by fragmented foraminifera (25 to 44%) and corraded foraminifera (20 to 44%). Group TF-2 (supratidal group) had an average silhouette width of 0.53, an elevation of 0.7 to 3.6 m above TP, and a distance from the shoreline ranging from 40 to 120 m. Unlike TF-1, TF-2 is dominated by corraded individuals (53 to 93%), followed by those that were fragmented (7 to 32%) and unaltered (0 to 29%). Group TF-3 (offshore group) is characterized by an average silhouette width of 0.72, an elevation of −125 to −13 m above TP, and a distance from the shoreline of −3080 to −46 600 m. Group TF-3 is dominated by
Table 1. Highest average silhouette width for tests 1 to 7. Results for transect 1 (T1), transect 2 (T2) and transects 1 and 2 combined (T1 + T2) for two to five cluster-scenarios. Highest silhouette widths per cluster scenario are indicated in bold.

Lithofacies (test 3)
Group LF-1 (onshore group) has an average silhouette width of 0.31, and samples within the cluster range in elevation from −124 to 4 m above TP, and −46 600 to 123 m in distance from the shoreline. Samples within LF-1 have a mean grain size of 0.15 to 2.15 ɸ (coarse to medium sand), with fine sand being the dominant grain size (d90 = 1.61 to 2.95 ɸ). Group LF-2 (offshore group; average silhouette width = 0.58) is dominated by samples that are slightly finer (mean = 2.57 to 2.97 ɸ, d90 = 3.05 to 3.40 ɸ) but still in the medium sand range. Samples from LF-2 are characterized by elevations that range from −87 to −13 m above TP and distances from the shoreline that range from −3080 to −33 640 m (excluding T1-12, which clustered erroneously).

Combination facies (tests 4 to 7)
Combination facies described here consist of those from test 4 (C4-1 to C4-3), test 5 (C5-1, C5-2), test 6 (C6-1, C6-2) and test 7 (C7-1, C7-2; Figs 5 and S8 to S10) and will be discussed in order of highest to lowest average silhouette width. Sample locations within C7-1 range in elevation from −124 to 4 m above TP and range between 123 m and −46 600 m from the shoreline. Unaltered foraminifera dominate the taphonomic assemblage in 45% of the samples from this cluster. The C7-1 cluster is characterized by sediments that are generally in the medium sand range (average mean grain size = 1.68 ɸ) but range from 0.15 ɸ (coarse sand) to 2.15 ɸ (fine sand). The d90 of samples within this cluster remains relatively constant within the medium sand range (1.61 to 2.95 ɸ); however, the d10 fluctuates between very fine pebbles (−1.37 ɸ) and medium sand (1.56 ɸ). The C7-2 cluster contained samples that were generally from lower elevations (−13 to −87 m above TP) and greater distances from the shoreline (−33 640 to −3808 m) compared to those from C7-1. The sediment grain size of C7-2 was slightly finer than that of C7-1 (average mean grain size = 2.75 ɸ; fine sand), ranging from 2.57 to 2.97 ɸ (fine sand). Neither the d10 nor the d90 fluctuated beyond medium to fine (1.92 to 2.53 ɸ) and very fine (3.05 to 3.40 ɸ) sand, respectively.
The C6-1 cluster contains sample sites located between 123 m and −46 600 m from the shoreline and between −124 m and 4 m above TP in elevation. The C6-1 samples are generally dominated by P. nipponica (2 to 63%) and, in most cases, A. parkinsoniana (1 to 45%), with C. refulgens and E. crispum present in smaller abundances (<23%). The mean grain size ranges from coarse (0.15 ɸ) to fine (2.15 ɸ) sand and has an average mean grain size of 1.68 ɸ (medium sand). The d90 values are predominantly within the fine sand range; however, the d10 values fluctuate between medium (1.92 ɸ) and fine (2.53 ɸ) sand. Cluster C6-2 comprises offshore samples that are generally deeper (−13 to −87 m above TP) and further from the shoreline (−3080 to −31 140 m). Planktonics (15 to 58%) dominate the foraminiferal assemblage in these samples. The C6-2 samples are finer than those that clustered in C6-1; the mean grain size ranges from 2.57 to 2.97 ɸ (fine sand).
Test 5, which clusters all three datasets, defined two clusters that can be differentiated by their elevation and distance from the shoreline. Cluster C5-1 consists of samples that are within 123 m of the shoreline and between −0.5 and 4.0 m above TP, except for offshore samples T1-27, T1-28, T2-21 and T2-22. The dominant species within C5-1 samples are P. nipponica (3 to 63%), A. parkinsoniana (1 to 45%) and E. crispum (up to 11%), with most foraminifera categorized as either corraded (12 to 93%) or unaltered (0 to 81%). The mean grain size ranges from coarse (0.15 ɸ) to fine (2.15 ɸ) sand and has an average mean grain size of 1.64 ɸ (medium sand). The d90 values are all within the fine sand range (1.61 to 2.76 ɸ); however, the d10 values fluctuate between very fine pebbles (−1.67 ɸ) and medium sand (1.56 ɸ). The C5-2 samples were generally lower in elevation (−13 to −87 m above TP) and further from the shoreline (−3080 to −33 640 m) than those within the C5-1 cluster. Pararotalia nipponica and planktonics dominate the species assemblage, ranging from 3 to 19% and 15 to 58%, respectively. All samples that clustered in C5-2 contained a taphonomic assemblage that was dominated by unaltered individuals (64 to 88%). The sediment grain size of C5-2 was slightly finer than that from C5-1; the mean grain size ranges from medium (1.95 ɸ) to fine (2.97 ɸ) sand, with an average mean of 2.71 ɸ (fine sand). Neither the d10 nor the d90 fluctuated beyond medium to fine (1.92 to 2.53 ɸ) and very fine (3.05 to 3.40 ɸ) sand, respectively, with the exception of T2-15.

Modern proxy distributions
Grain-size results are distinguished between onshore and offshore samples along two transects (Fig. 5). In general, the mean grain size along T1 and T2 ranged from fine to coarse sand and became progressively finer with increasing depth (Fig. 2), a trend previously documented along the Kujukuri shelf (Nishida et al., 2019) as well as other locations (e.g. Gao & Collins, 1994). At a distance of ca 23 km offshore, the mean grain size became coarser, likely because of interaction with the Kuroshio Current (Mizuno & White, 1983). Onshore samples were moderately well-sorted, which is to be expected because of wind-transported sediment accumulating in the dunes and sorting by waves and tides (Jiang et al., 2015). Similarly, offshore samples were generally well-sorted to very well-sorted because sedimentation at these depths is a function of finer sediment settling out of suspension.
Taxonomic assemblages of foraminifera are often employed as a means of assessing both storm and tsunami deposits on the basis that marine taxa are transported inland to locations where they normally would not occur (e.g. Mamo et al., 2009; Pilarczyk et al., 2014). Studies have also used foraminifera to infer provenance of sediments contained within overwash sediments (Nanayama & Shigeno, 2006; Lane et al., 2011; Sieh et al., 2015). Even though numerous studies have reported on the distribution of foraminifera within nearshore and offshore environments (e.g. Hayward, 1999; Berkeley et al., 2007), it is necessary to establish site-specific characteristics (Kosciuch et al., 2018). Results from the modern sediment collection at Kujukuri provide a new modern dataset and will prove useful in constraining sediment provenance for tsunami deposits because taxonomic data clustered into two distinct biofacies (BF-1 and BF-2) representing onshore and offshore sources of sediment.
Along the Kujukuri shelf, foraminiferal assemblages varied with distance from the shoreline (Figs 2 and 3). For example, planktonic foraminifera are present in low abundances within the swash and foreshore environments (0 to 9% in T1; 4 to 12% in T2) and peak (17 to 48% in T1; 15 to 58% in T2) in the offshore zone at distances of >30 km from the shoreline. Planktonics are notably absent in all dune samples. The presence of planktonics has previously been used to document overwash deposits (Davies et al., 2003; Hawkes et al., 2007; Uchida et al., 2010); however, the extent to which they accumulate in coastal sediments versus the extent to which they get washed in by a tsunami wave or storm surge is difficult to assess. Beach and dune sediments at Kujukuri do not contain high relative abundances of planktonic foraminifera, and therefore higher abundances within overwash deposits would indicate a contribution of offshore sediments from tens of kilometres seaward of the shoreline.
Like planktonics, Ammonia parkinsoniana varied in abundance with distance from the shoreline, but with the opposite trend. This foraminifer, which is common in intertidal environments (Hayward et al., 2004), was abundant in supratidal (12 to 23% in T1) and intertidal (15 to 30% in T1) samples, but much less so in offshore samples (0 to 4% in T1). This is consistent with the findings of Hayward & Hollis (1994), who report the occurrence of Ammonia in brackish intertidal and shallow subtidal sediments. Although abundances of robust-tested foraminifera such as A. parkinsoniana peaked in the intertidal zone where wave energy is the highest, species with thin-walled, fragile tests were more common in the offshore. For example, Uvigerinella glabra is generally absent from samples up to a distance of 6.6 km offshore (depth = −22 m above TP). This is consistent with a study by Pilarczyk et al. (2011) where the fragile Ammonia tepida was limited to low-energy, protected areas within a lagoon, and the more robust-tested Ammonia parkinsoniana and Ammonia convexa were more abundant in the higher energy swash zone.
The taphonomic character of foraminifera varied predictably with distance from the shoreline. Calcareous foraminifera within the swash zone and deeper are sheltered from subaerial exposure and have the highest proportions of taphonomically unaltered individuals across all samples (for example, ca 45% unaltered in T2 swash sediments versus 0% in dune sediments). Increased residence time in the foreshore, backshore and dune environments causes foraminiferal tests to become increasingly corroded and abraded ('corraded'). For example, along T1, the average percentage of corraded individuals increased with increasing distance from the shoreline from 22% in the swash, to 41% in the foreshore, to 68% in the backshore, to 89% in the dune (Fig. 2). These findings are consistent with Pilarczyk et al. (2011) and Kosciuch et al. (2018) who report higher abundances of taphonomically unaltered foraminifera within subtidal environments relative to intertidal and subaerial sediments.
Fragmentation of foraminifera showed only a minor increase in abundance with increasing distance inland in both of the transects (Figs 2 and 3). Several studies report increased abundances of fragmented foraminifera within tsunami deposits (e.g. Kortekaas & Dawson, 2007; Chagué-Goff et al., 2011). This suggests that fragmentation within tsunami deposits may be less reflective of the original depositional environment and more the result of overprinting by a tsunami.
Microfossil taphonomy, underutilized as a proxy in overwash studies (Mamo et al., 2009; Pilarczyk et al., 2014), showed the greatest relationship with increasing distance inland (Figs 2 and 3) and produced cluster scenarios with the highest average silhouette width. Tests that included taphonomic datasets were the only cases where separation between offshore, intertidal and supratidal zones could be made (tests 2 and 4; Fig. 5). This is similar to Kosciuch et al. (2018), who report that taphonomic datasets enhanced the utility of taxonomic datasets in a tropical reef flat environment. Studies by Goodman-Tchernov et al. (2016) and Hoffmann et al. (2018) used a subset of taphonomic characters of individual foraminifera in combination with archaeological and sedimentological evidence to document palaeotsunami deposits. Usami et al. (2017) used the presence of well-preserved tests from thin-walled (delicate) species to assess sediment transport by a turbidity current generated by the 2011 Tohoku tsunami. Collectively, these studies highlight the power of foraminiferal taphonomy; however, it is important to note that taphonomy should not be used as a stand-alone technique. Rather, taphonomy is best used in combination with other proxies, such as taxonomy, where it can strengthen interpretations and provide ecological context.

Modern baseline studies and sediment provenance
Sediment provenance assessment is integral to the proper interpretation of overwash deposits and is particularly important when: (i) distinguishing overwash deposits from overlying or underlying sediments that are similar in composition (e.g. Pilarczyk et al., 2011; Kelsey et al., 2015; Kosciuch et al., 2018); (ii) attempting to assess the magnitude of an event (Uchida et al., 2010; Sieh et al., 2015); and (iii) distinguishing between different types of events (for example, storm versus tsunami; Donato et al., 2008; Switzer & Jones, 2008). Sedimentary and micropalaeontological proxies have previously been used to assess sediment provenance and interpret tsunami deposits (Hemphill-Haley, 1995; Hawkes et al., 2007; Sawai et al., 2009a,b; Chagué-Goff et al., 2011); however, a quantitative means of determining which proxies are most effective in constraining sediment provenance is rarely used. This study provides a new quantitative dataset of modern coastal and offshore sediments and foraminifera from Japan for the purpose of identifying and inferring the sediment provenance of tsunami deposits in the palaeorecord. The 2011 Tohoku tsunami impacted the Kujukuri shelf and coastline and may have resulted in persistent homogenization of the surface sediment. However, for the following reasons, this study assumed that the collected sediments indeed do represent normal background conditions: (i) the tsunami at this location resulted in <2 m of run-up (Mori et al., 2012); (ii) surface samples in this study were collected five years after the tsunami; (iii) sampling was restricted to the upper 1 cm of sediment; and (iv) clear zonations were evident, indicating a lack of homogenization by tsunami disturbance.
Studies of the rates of recovery of benthic faunal communities after major tsunami events have shown that in shallow marine environments, the complete recovery of the benthic community is possible after five days (Altaff et al., 2005), within 50 days (Szczuciński et al., 2006) or within several months (Yamada et al., 2014). However, deeper continental slope benthic communities off south-east India had not fully recovered within 18 months of the 2004 Indian Ocean Tsunami (Khan et al., 2018). Thus, the assumption that the Kujukuri shelf and coastline had returned to pre-tsunami conditions is reasonable given the five-year interval between the tsunami and the collection of the surface samples analyzed in this study.
Using Partitioning Around Medoids (PAM) cluster analysis, two coastal zones (onshore and offshore) were objectively discriminated, and it was quantitatively determined which proxy datasets are most effective in constraining sediment provenance within these zones and which proxies will be most useful in documenting tsunami deposits at Kujukuri. The findings of this study show that, when compared with grain-size (average silhouette width = 0.40) and taxonomic (average silhouette width = 0.13) datasets, the surface condition of individual foraminifera (taphonomy; average silhouette width = 0.53 for a two-cluster scenario, 0.63 for a three-cluster scenario) was the best indicator of sediment provenance (Fig. 5; Table 1).
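To illustrate how an average silhouette width is obtained from a PAM partition, the following is a minimal sketch in plain Python. It assumes Euclidean distances, a greedy build step followed by a swap step, and a small hypothetical two-variable dataset; the function names (pam, average_silhouette_width) are illustrative, and the sketch is not the software or the data used in this study.

```python
def euclid(a, b):
    """Euclidean distance between two samples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of distances from every sample to its nearest medoid."""
    return sum(min(euclid(p, points[m]) for m in medoids) for p in points)


def assign(points, medoids):
    """Label each sample with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: euclid(p, points[medoids[j]]))
            for p in points]


def pam(points, k):
    """Simplified PAM: greedy BUILD, then SWAP while the total cost decreases."""
    n = len(points)
    medoids = []
    for _ in range(k):  # BUILD: add the sample that lowers the cost the most
        best = min((i for i in range(n) if i not in medoids),
                   key=lambda i: total_cost(points, medoids + [i]))
        medoids.append(best)
    improved = True
    while improved:  # SWAP: exchange a medoid for a non-medoid if it helps
        improved = False
        for mi in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:mi] + [cand] + medoids[mi + 1:]
                if total_cost(points, trial) < total_cost(points, medoids):
                    medoids, improved = trial, True
    return medoids, assign(points, medoids)


def average_silhouette_width(points, labels):
    """Mean of s(i) = (b - a) / max(a, b); assumes at least two clusters."""
    n = len(points)
    widths = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # singleton cluster: s(i) = 0 by convention
            continue
        a = sum(euclid(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclid(points[i], points[j])
                for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        widths.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(widths) / n


# Hypothetical samples forming two tight groups (not data from this study)
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
asw = {}
for k in (2, 3):
    _, labels = pam(samples, k)
    asw[k] = average_silhouette_width(samples, labels)
best_k = max(asw, key=asw.get)
```

For this toy dataset the two-cluster scenario yields the higher average silhouette width, which is the same criterion used above to compare cluster scenarios and proxy datasets.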
Foraminiferal taphonomy (test 2) was the only test where a three-cluster scenario produced the highest silhouette width (average silhouette width = 0.63) and made ecological sense, separating the sample sites into supratidal (86% of samples clustered appropriately), intertidal (93% of samples clustered appropriately) and offshore (100% of samples clustered appropriately) zones (Fig. 5). Although the highest average silhouette widths for tests 1, 3 and 5 to 7 were produced under a three-cluster scenario, the clusters contained a mixture of samples from multiple zones and did not make ecological sense. For example, in a three-cluster scenario, taxonomic data assigned samples to the supratidal, intertidal and offshore zones with an accuracy of 50%, 40% and 100%, respectively. Accuracy was improved when the two-cluster scenario was used; 84% of samples were appropriately clustered in the onshore zone, while 94% of samples clustered appropriately in the offshore zone. Kosciuch et al. (2018) used PAM cluster analysis to identify six zones in a modern carbonate reef environment from Vanuatu. Similar to the samples collected from Kujukuri, Kosciuch et al. (2018) found that taphonomic data produced clusters with a higher average silhouette width (0.46) than taxonomic data (0.28). However, unlike the present study, samples from the carbonate reef only clustered with 66% (taxonomy) and 64% (taphonomy) accuracy. The study from Vanuatu concluded that the combined use of taxonomy and taphonomy produced a cluster scenario with the highest percentage of appropriately clustered samples (94%). At Kujukuri, the combined use of taxonomy and taphonomy produced a three-cluster scenario where 73%, 85% and 100% of samples clustered appropriately (average accuracy = 88%) in the supratidal, intertidal and offshore zones, respectively (Fig. 5; test 4).
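The percentage of samples clustered appropriately in each zone can be computed along the following lines. This is a minimal sketch of one plausible reading, in which each cluster is named after the zone that contributes most of its samples; the function name zone_accuracy and the sample assignments are hypothetical and are not taken from the datasets of this study.

```python
from collections import Counter


def zone_accuracy(zones, clusters):
    """Per-zone percentage of samples that fall in a cluster named after that zone."""
    # Name each cluster after the zone contributing most of its samples
    cluster_zone = {}
    for c in set(clusters):
        members = [z for z, k in zip(zones, clusters) if k == c]
        cluster_zone[c] = Counter(members).most_common(1)[0][0]
    # For each zone, count the samples whose cluster carries that zone's name
    result = {}
    for zone in set(zones):
        assigned = [cluster_zone[k] for z, k in zip(zones, clusters) if z == zone]
        result[zone] = 100.0 * assigned.count(zone) / len(assigned)
    return result


# Hypothetical sampling zones and cluster labels (not data from this study)
zones = ["supratidal"] * 4 + ["intertidal"] * 5 + ["offshore"] * 6
clusters = [0, 0, 0, 1,
            1, 1, 1, 1, 0,
            2, 2, 2, 2, 2, 2]

acc = zone_accuracy(zones, clusters)
mean_of_zones = sum(acc.values()) / len(acc)  # every zone weighted equally
```

In this toy case three of four supratidal samples, four of five intertidal samples and all offshore samples fall in the matching cluster. An overall figure depends on whether zones or individual samples are weighted equally, so the weighting should be stated whenever an average accuracy is reported.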

CONCLUSION
Detailed modern distributions of sedimentological proxies such as grain size, foraminiferal taxonomy and taphonomy will aid in the identification of tsunami deposits that are preserved in coastal sediments along eastern Japan because they identify sediment sources within coastal, nearshore and offshore environments. The identification of these sediment sources in the modern environment is important in assessing the provenance of tsunami deposits because they provide insight into how far and from what depth the sediments were transported by a tsunami wave.
Although a multi-proxy approach is necessary to properly assess tsunami deposits in the geological record, results from this study show that taphonomy, which is not commonly used in overwash studies, will be the most useful parameter for constraining sediment provenance and aiding in the interpretation of additional anomalous sand layers at this site, as well as other temperate locations. In contrast, foraminiferal taxonomy, a common proxy used to assess overwash deposits, proved to be the least effective parameter in distinguishing offshore and onshore zones when used in isolation from other proxies. The results of this study show that the best combination of parameters for identifying the sediment source of tsunami deposits at Kujukuri will be taxonomy and taphonomy, and other studies document the use of taphonomy in discriminating tsunami deposits in the geological record. For these reasons, it is recommended that taphonomic analysis be applied when assessing foraminiferal taxa in overwash deposits.

Figure S1. Results of PAM cluster analysis for T1 showing tests 1 to 3 (a) to (c). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taxonomic (test 1), taphonomic (test 2) and grain-size (test 3) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S2. Results of PAM cluster analysis for T1 showing tests 4 to 6 (d) to (f). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taxonomic + taphonomic (test 4), taxonomic + taphonomic + grain-size (test 5), taxonomic + grain-size (test 6) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S3. Results of PAM cluster analysis for T1 showing test 7 (g). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taphonomic + grain-size (test 7) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S4. Results of PAM cluster analysis for T2 showing tests 1 to 3 (a) to (c). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taxonomic (test 1), taphonomic (test 2) and grain-size (test 3) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S5. Results of PAM cluster analysis for T2 showing tests 4 to 6 (d) to (f). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taxonomic + taphonomic (test 4), taxonomic + taphonomic + grain-size (test 5), taxonomic + grain-size (test 6) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S6. Results of PAM cluster analysis for T2 showing test 7 (g). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taphonomic + grain-size (test 7) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S7. Results of PAM cluster analysis for T1 + T2 showing tests 1 to 2 (a) to (b). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taxonomic (test 1) and taphonomic (test 2) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S8. Results of PAM cluster analysis for T1 + T2 showing tests 3 to 4 (c) to (d). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of grain-size (test 3) and taxonomic + taphonomic (test 4) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S9. Results of PAM cluster analysis for T1 + T2 showing tests 5 to 6 (e) to (f). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taxonomic + taphonomic + grain-size (test 5) and taxonomic + grain-size (test 6) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Figure S10. Results of PAM cluster analysis for T1 + T2 showing test 7 (g). (i) Average silhouette width for 2, 3, 4 and 5 clusters. Highest average silhouette width (indicating the strongest structure) for each test is indicated by a dashed line. Silhouette plots of taphonomic + grain-size (test 7) datasets divided into 2 (ii), 3 (iii), 4 (iv) and 5 (v) clusters. Grey and white bars differentiate between clusters. The average silhouette width is indicated by a dashed line.
Table S1. Grain-size statistics for T1.
Table S2. Grain-size statistics for T2.
Table S3. Foraminiferal taxonomic and taphonomic abundance data for T1 samples. Beach zone, distance from shoreline (m) and elevation (m above TP) are indicated.
Table S4. Foraminiferal taxonomic and taphonomic abundance data for T2 samples. Beach zone, distance from shoreline (m) and elevation (m above TP) are indicated.