Introduction

Walking, as a part of active transportation, is important for sustainable city design, which is one of the major Sustainable Development Goals (SDG 11) of the United Nations1,2. A shift of travel from motorized modes to non-motorized modes like walking provides several benefits such as reducing vehicular emissions and congestion3,4, promoting a healthy lifestyle, and inducing a sense of a safe and vibrant urban community5,6. The quantity and quality of walking as a travel mode in an urban environment is an important contributor to its environmental sustainability3,7, public health8,9, livability10, and equity11.

This qualification and quantification are often described by the terms ‘walkability’ and ‘walking’ respectively in the research literature6,12. ‘Walking’ refers to the actual pedestrian behavior that is often measured in terms of walking frequency or pedestrian traffic count13,14.

‘Walkability’, on the other hand, is more complicated. There is significant debate about its exact definitional scope15. It is generally understood from the context of either pedestrians’ subjective perception of the utility, safety, and comfort in taking a given route10 or the urban authorities responsible for the provision and development of such an environment. Regardless of its definition, researchers and practitioners agree that highly walkable regions are typically characterized by compactness and diversity of land use, a well-connected network of walkable paths, safe and esthetically pleasing neighborhoods, and high accessibility to places of several kinds for all sociodemographic groups16,17.

The quantification of walkability is chiefly driven by these concepts of utility, safety, and comfort. In practice, this translates to considering factors like land use density and mix, access, and connectivity5, and in some cases, esthetic and subjective experiences18. Multiple walkability indices (WIs) have been proposed for different scopes and purposes.

Subjective WIs often rely on surveys and auditing tools15,19. These are normally conducted at the microscopic scale and capture the subtle nature and necessities of walking, such as perceived safety and infrastructure quality. While local auditing and surveys play a substantial role in current policymaking and implementation practices, they are limited by their sample sizes, self-selection bias, and representativeness issues20.

In contrast, objective WIs often rely on commonly available large-scale geographic information system (GIS) data and analysis, which make them more scalable to whole cities and comparable with other regions21. In GIS-based WIs, components and facilities like the road network, land use density and diversity, and accessibility to points of interest (POIs) such as grocery stores and healthcare facilities are often included12, which are important contributors to urban transportation network resilience22. Despite usually missing key minute details about walking as in surveys and audits, GIS-based WIs are relatively objective, scalable, and easy to compute, especially with the recent rise in the availability of large-scale high-resolution geospatial information23. A recent review from Shields (2021) describes the proliferation of GIS-based WIs in the last two decades21, highlighting the need for making fewer assumptions in such metrics. Some popular approaches and indices include a buffer-based POI accessibility analysis (e.g., in8), using a ‘pedshed’ to approximate reachable areas (e.g., in24), and composite metrics modeled on multiple land use covariates such as road and intersection density12 and sprawl index25. WalkScore™ 26 and the US National Walkability Index (NWI)27 are commonly used WIs that are also based on GIS analysis, but they have several limitations like buffer-based analysis and a lack of consideration for pedestrian infrastructure.

While aggregate WIs are a good measure for comparing cities and neighborhoods, they bear several limitations12,28. In this study, we focus on two of these.

First, many existing WIs assume the importance of street connectivity rather than that of walkable paths. This is problematic because places with inadequate pedestrian infrastructure may still have high connectivity for vehicles, such as on highways, and such areas would incorrectly contribute to higher pedestrian connectivity. Similarly, some highly walkable neighborhoods such as university campuses may have sound pedestrian infrastructure that is not included if we consider the road network as a proxy for the pedestrian network29. Understandably, this substitution is made because analysts often lack high-quality data on pedestrian infrastructure such as sidewalks and pedestrian crossings30,31.

To this end, researchers have proposed the concept of a ‘pedestrian network’ (henceforth referred to as ‘pednet’ in this study). It is analogous to a street network but for walkable paths (sidewalks and trails) and crossings (crosswalks and pedestrian bridges)32,33. Studies using pednets are usually small-scale and generally not directly related to assessing walkability32,34,35. An exception to this small analysis scale is the study by Zhao et al. (2020) who developed a detailed 3-dimensional pednet of Hong Kong36, though their discussion is geared towards understanding its topology as opposed to walking accessibility.

The second major limitation of existing WIs is that they are generally meant for descriptive purposes rather than design purposes5,21. This means that they provide little insight into design decisions, such as which sidewalk or crosswalk segment should be prioritized for improvement to maximize the benefits in terms of increased connectivity, pedestrian movement, and overall ease of walking. Having a link- or corridor-level WI can significantly help in translating abstract knowledge of walkability into engineering design actions28.

This network design problem is usually approached by either deterministic traffic assignment or critical link estimation (CLE). CLE is an important concept in passenger traffic and freight flow networks37 and is considered an NP-hard problem38, meaning it is not easily scalable to large networks using analytical solutions since it involves iterating over all possible link-disrupted networks37,39. A more practical approach is to approximate link costs using Monte Carlo simulations of origin-destination (OD) pairs and shortest path computation40,41. In this approach, a suitable demand generation and distribution method is used to prepare OD samples. This is followed by computing the shortest path between each OD pair and assigning all path edges a unit flow, i.e., an observational all-or-nothing assignment. This substantially reduces the computational complexity of network design, which is relevant for pednet design as well.
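The simulation-based approach described above can be sketched in a few lines. The following is a minimal illustration, not the implementation used in this study: the toy network, the uniform random OD sampling, and the function names `shortest_path` and `assign_unit_flows` are all assumptions made for the example. It samples OD pairs, finds each pair's shortest path with Dijkstra's algorithm, and adds one unit of flow to every edge on that path.

```python
import heapq
import random
from collections import defaultdict


def shortest_path(adj, source, target):
    """Dijkstra's algorithm; returns the node list of the shortest path or None."""
    dist = {source: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, length in adj[u]:
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in visited:
        return None  # the OD pair is not connected in this network
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def assign_unit_flows(edges, od_pairs):
    """All-or-nothing assignment: one unit of flow per connected OD pair
    on every edge of its shortest path."""
    adj = defaultdict(list)
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))
    flow = defaultdict(int)
    connected = 0
    for origin, dest in od_pairs:
        path = shortest_path(adj, origin, dest)
        if path is None:
            continue
        connected += 1
        for u, v in zip(path, path[1:]):
            flow[frozenset((u, v))] += 1
    return dict(flow), connected


# Hypothetical toy network: (node, node, length); D-E is a separate component.
toy_edges = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 3.0), ("D", "E", 1.0)]
toy_nodes = ["A", "B", "C", "D", "E"]

# Uniform random OD sampling is an assumption made only for this sketch.
rng = random.Random(42)
od_sample = [tuple(rng.sample(toy_nodes, 2)) for _ in range(1000)]
flows, n_connected = assign_unit_flows(toy_edges, od_sample)
```

Each sampled pair costs one shortest-path search, so the total effort grows with the number of OD samples rather than with the number of possible link-disrupted networks.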

In this study, we address the two major limitations described above by introducing a composite index of walking and walkability called ‘pednet score’. This index is based on a weighted pedestrian network whose segments are walkways (sidewalks and crosswalks), as illustrated in Fig. 1. The pednet score of each walkway is computed as the product of its distance-scaled edge betweenness and its edge weight, which reflects its popularity as determined through pedestrian trip simulation (more details in Section ‘Pednet score’). Through this method, high-volume and highly connected walkways are identified as high-priority segments that can then be used for pedestrian network design. We apply this technique to the existing pednets of three North American cities (Austin, TX; Cambridge, MA; and Toronto, ON) and to several variants of each. Then, we identify high-priority sidewalks and crosswalks, both existing ones and candidates whose development can lead to the greatest increase in walkability and walking.

Fig. 1: Illustration of components of the pednet and trip simulation in a neighborhood in Austin, TX.

The existing and candidate pednet links are shown in solid and dashed lines respectively. The shortest paths of two sample OD pairs are shown. D1 is not accessible from O along the existing pednet but becomes accessible when candidate crosswalks are added (thick blue and green lines). D2 is accessible from O in both cases but its shortest path’s length reduces slightly when crosswalks are added (thick blue line) compared to its existing pednet (thick pink line).

Results

Pednet variants

In this study, we compare the link performance of three variants of a city’s pednet relative to its base (currently existing) pednet to compare the extent of walkability and the distribution of the critical links induced by those changes. These variants are described below in increasing order of connectivity and cost along with their labels.

  1. Base: This is the pednet induced by the existing walkways and serves as the base scenario. All subsequent pednet scenarios are an improvement in service over this pednet.

  2. +Sidewalks: In this variant, non-existing yet pre-digitized sidewalk segments available in the sidewalk geometry layer are included. These are hereby called ‘candidate’ segments as they are used for assessing whether their improvement to a state of usability would significantly improve pedestrian movement in the pednet. This information is not available for the city of Cambridge and thus this variant is not included in Cambridge’s analysis.

  3. +Crosswalks: This does not include candidate sidewalks but includes the created candidate crosswalk segments (as described in Supplementary Section 2.3).

  4. +Both: It includes both sidewalks and crosswalks, whether existent or candidate. This is the largest network that represents the highest connectivity for pedestrians.

The summary statistics of the pednet variants described above, such as the graph size, connectivity, and mileage, are shown in Table 1. It is clear that the total number and length of edges and connectivity increase substantially as proposed segments are added to the pednet. However, the study cities show considerably different increases in cost and connectivity.

Table 1 Summary statistics of pednet variants in the study cities, showing the number of graph components and total mileage of sidewalks and crosswalks.

In Austin, for example, with 35,985 new crosswalks totaling 400 miles (difference of variant A2 from A0), the number of connected components falls from 11,319 to 7702 (a 32% reduction). With 34,306 new sidewalks totaling 3336 miles (A1), the improvement in terms of connected components would be slightly smaller (a 25% reduction, to 8511 components).
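To illustrate how the number of connected components responds to added links, the sketch below counts components with a union-find structure before and after adding one candidate crosswalk. The six-node network and the helper names `count_components` and `pct_reduction` are hypothetical; only the Austin component counts in the last two lines come from the text above.

```python
def count_components(nodes, edges):
    """Count connected components of an undirected graph with union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = len(parent)
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components


def pct_reduction(before, after):
    """Percent reduction from `before` to `after`."""
    return 100.0 * (before - after) / before


# Hypothetical toy pednet: three disjoint sidewalk stretches.
nodes = [1, 2, 3, 4, 5, 6]
base_edges = [(1, 2), (3, 4), (5, 6)]
candidate_crosswalks = [(2, 3)]  # one crosswalk joining two stretches

n_base = count_components(nodes, base_edges)
n_improved = count_components(nodes, base_edges + candidate_crosswalks)

# Component counts reported in the text for Austin (A0 vs. A2 and A0 vs. A1).
austin_crosswalks = pct_reduction(11319, 7702)
austin_sidewalks = pct_reduction(11319, 8511)
```

In the toy case, a single crosswalk merges two of the three components, which is the same mechanism by which a short candidate link can produce a large connectivity gain.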

Cambridge’s pednet, on the other hand, is already highly connected and has limited scope for crosswalk improvement, with only 733 candidate crosswalks totaling 8 miles (variants C0 vs. C1). Similarly, Toronto’s base pednet is also reasonably well-connected. Even though its pednet has far fewer links than Austin’s, its total mileage (5063 mi) is greater than Austin’s (3741 mi). This is because Austin’s sidewalk network consists of shorter segments broken at the house boundary level, whereas Toronto’s segments are defined at the block level.

Improvement by pednet variant

The difference in the level of connectivity of the different pednet variants results in significant variation of edge weights and consequently pednet scores, both across the improvement type (sidewalks and/or crosswalks) and by city. Edge weights simply reflect the total number of trips (i.e., connected OD pairs in a given pednet variant) out of the one million OD pairs generated in each city (see Section ‘Pednet score’ and Supplementary Section 3 for details). In the base pednet variants, only 27% of OD pairs are connected in the case of Austin (A0), whereas this figure is higher in the case of Cambridge (39%) (C0) and much higher in Toronto (80%) (T0) (see Fig. 2a).

Fig. 2: Comparison of the pedestrian trip metrics and pednet scores of the pednet variants of the three study cities.

a Number of connected OD pairs (trips), and distributions of b path length (mi), c edge weights, and d pednet scores (excluding zero-score edges). The values are capped to upper limits for the sake of visualization consistency. Pednet variant labels are the same as in Table 1.

Austin is the most promising in terms of an increase in walking usage with the improvement of both sidewalks and crosswalks. The addition of ≈39,000 sidewalk segments totaling 3583 mi (improving from A0 to A1) results in an addition of 196,566 trips (i.e., 20% of total possible trips). Interestingly, the addition of ≈36,000 crosswalks (402 mi, in improving from A0 to A2) increases the number of trips even more by 256,443. When all possible sidewalks and crosswalks are added, it leads to a significant increase of 639,249 additional trips to a total of 909,617 (91% of all possible trips).

Toronto also shows reasonable improvement after adding sidewalk segments (803,914 in T0 to 999,241 in T1, an increase of 24%). Notably, however, it only shows marginal improvement in the trip count by adding crosswalks (803,914 in T0 to 804,258 in T2, an increase of 0.04%). Cambridge, similarly, shows limited improvement in trip count (from 393,960 to 407,357, an increase of 3.4%) as its current pednet is already substantially connected.
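The percentages quoted above follow directly from the reported trip counts. The short check below recomputes them; the helper name `pct_increase` is ours, and every number is taken from the text.

```python
def pct_increase(before, after):
    """Percent increase from `before` to `after`."""
    return 100.0 * (after - before) / before


TOTAL_OD_PAIRS = 1_000_000  # OD pairs generated in each city

# Toronto: base (T0) vs. +Sidewalks (T1) and +Crosswalks (T2).
toronto_sidewalks = pct_increase(803_914, 999_241)
toronto_crosswalks = pct_increase(803_914, 804_258)

# Cambridge: base (C0) vs. +Crosswalks (C1).
cambridge_crosswalks = pct_increase(393_960, 407_357)

# Austin: A3 reaches 909,617 trips after adding 639,249, which implies the base count.
austin_base_trips = 909_617 - 639_249
austin_base_share = 100.0 * austin_base_trips / TOTAL_OD_PAIRS
```

The implied Austin base count of 270,368 trips agrees with the 27% share of connected OD pairs reported for A0.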

The high potential for improvement of sidewalks and crosswalks in Austin, and to a lesser extent, sidewalks in Toronto, is also evident in the distributions of edge weights and pednet scores. The edge weights resulting from the pedestrian traffic assignment are super-exponentially distributed (Fig. 2c), with many more high-weight edges in A3 and T3 than in the other corresponding pednets. The more connected networks also exhibit progressively longer paths, since better connectivity makes longer trips possible (Fig. 2b).

The distributions of the pednet scores show a considerable compound effect of connectivity and usage (Fig. 2d). The distribution of A3 is more skewed left than that of A0. Since walkway usage (edge weight) is inherently linked to its connectivity (dEB score) and the pednet score is effectively the product of these figures, the pednet score creates a compound effect that helps it differentiate between high and low criticality edges. The distribution of A3 is also more concentrated than those of the other three Austin networks, showing a tendency for the pednet score to become less distinguishable in highly connected regions. Lastly, the distinctive distribution of A3 compared to A1 and A2 shows the benefit of a mixed improvement approach involving adding some sidewalks and some crosswalks instead of focusing on just one component.

In Toronto, even though adding candidate sidewalks leads to a substantial increase in walking, its base pednet is connected well enough (T0) to make the pednet scores similar to when sidewalks are added (T1 and T3).

Spatial variation of pednet variants

The previous section illustrates the overall patterns of improvement in walking with the addition of walkways. However, urban planners, developers, and policymakers often need to study where such improvement needs prioritization. Taking Austin as a reference case, we show the existence of significant spatial heterogeneity in the current pednet score distribution as well as its growth with each improvement scheme. In Fig. 3, the top row shows the growth of edge weights as the networks become more connected and the regions in which that happens. Similar maps for the other two cities are shown in Supplementary Fig. 3.

Fig. 3: Visualization of edge weights and pednet scores.

Edge weights (top row) and pednet scores (bottom row) by pednet variant in Austin (labels on the top left). Darker shades reflect more connected and used sidewalk and crosswalk segments. Neighborhood outlines are shown in light blue. Maps produced using Python.

The Neighborhood Planning Areas (NPAs) of Austin (available on the Austin city data portal) are marked with a blue outline in Fig. 3. It can be seen that the base pednet (A0) only shows high usage in the city core, most notably in the NPA ‘Downtown’. This is understandable as the downtown region is the largest demand attractor with the most destination points (mostly offices and commercial POIs). The regions north of downtown, including the University of Texas and West University NPAs, also show significant walkway usage as well as high pednet scores.

As the network connectivity grows, walkway usage spreads outside the core of the city, e.g., in A1, where non-existent sidewalks are considered existent. This effect is particularly pronounced in the suburbs where many sidewalk segments start to facilitate trips (seen in yellow-colored regions outside the NPAs, especially in northern and western Austin).

Adding crosswalks (A2) seems to produce a more drastic increase in edge weight and criticality than adding sidewalks (A1). While this is evident from the difference between A1 and A2 in the distributions of usage and connectivity metrics shown in Fig. 2, the difference is not spatially uniform. Notably, A2 shows a more drastic increase in edge weight and pednet score in the core of the city (the region within the NPAs) than A1 does. On the other hand, A2 does not exhibit the same growth of walkway usage in the suburban segments (the yellow regions outside the city core in A1). This could be because, in the suburbs, one crosswalk can connect long disparate stretches of sidewalks, thereby reducing the need for adding multiple crosswalks. In terms of pednet score (bottom row of Fig. 3), it can be seen that adding crosswalks creates multiple patches of highly connected regions (as seen by clusters in red in the pednet score panel for A2) separated by relatively less connected regions.

The addition of both sidewalks and crosswalks naturally leads to the greatest connectivity and usage across all the regions. In particular, the distribution of pednet score becomes more concentrated in A3, where most walkways have a score close to the mean of 7.5. This implies that as the pednet becomes denser, the power of the pednet score in differentiating the contribution of edges reduces. Fortunately, for network design purposes, the pednet score still exhibits significant discriminatory power.

Finding critical links using marginal benefit curves

In the previous section, a visual inspection of the pednet score distribution hints at its suitability for identifying the walkways that contribute the most to the composite of connectivity and demand-based usage. In this section, we use it to identify those critical walkways and verify their utility.

To achieve this, we create multiple pednet variants for each city and each treatment type by sequentially adding candidate walkways in descending order of the pednet scores, starting with the base scenario, up to the variants corresponding to the highest possible degree of improvement (the other variants in Table 1). The benefits in terms of an increase in the trip count and distance savings based on updated paths are then computed. To compare the results with a control group, these benefits are also computed for a set of 20 randomly ordered priority rank lists of walkways. The resulting marginal benefit curves are shown in Fig. 4. Here, the abscissas show the percent increase in the added length of sidewalks and/or crosswalks, ranging from 0% (current pednet) to 100% (maximum possible addition, corresponding to the treatment scenarios in Table 1). The median and the interquartile range of the benefits of the control group at each unit of marginal link addition (x-axis) are denoted by the dashed lines and the shaded regions respectively.
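The procedure above can be sketched on a toy network. This is a simplified illustration under stated assumptions, not the code used in this study: the network, the candidate scores, and the function names are hypothetical, connectivity is checked with union-find rather than full shortest-path computation, and the control median is taken by step index, which matches the x-axis only when all candidates have equal length.

```python
import random
from statistics import median


def connected_pairs(nodes, edges, od_pairs):
    """Count OD pairs whose endpoints lie in the same connected component."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return sum(find(o) == find(d) for o, d in od_pairs)


def benefit_curve(nodes, base_edges, ordered, od_pairs):
    """Return (percent of length added, percent of maximum trip gain) points.

    `ordered` holds candidate (u, v, length) tuples in the order they are added.
    """
    base = connected_pairs(nodes, base_edges, od_pairs)
    full = base_edges + [(u, v) for u, v, _ in ordered]
    max_gain = connected_pairs(nodes, full, od_pairs) - base
    total_len = sum(length for _, _, length in ordered)
    curve, edges, added = [(0.0, 0.0)], list(base_edges), 0.0
    for u, v, length in ordered:
        edges.append((u, v))
        added += length
        gain = connected_pairs(nodes, edges, od_pairs) - base
        pct_gain = 100.0 * gain / max_gain if max_gain else 0.0
        curve.append((100.0 * added / total_len, pct_gain))
    return curve


def control_median(nodes, base_edges, candidates, od_pairs, n_lists=20, seed=0):
    """Median benefit per addition step over randomly ordered candidate lists."""
    rng = random.Random(seed)
    curves = []
    for _ in range(n_lists):
        shuffled = candidates[:]
        rng.shuffle(shuffled)
        curves.append(benefit_curve(nodes, base_edges, shuffled, od_pairs))
    return [median(c[i][1] for c in curves) for i in range(len(candidates) + 1)]


# Hypothetical toy data: (u, v, length, score); the scores are made up.
nodes = list("abcde")
base_edges = [("a", "b")]
scored = [("d", "e", 1.0, 0.2), ("b", "c", 1.0, 0.9)]
od_pairs = [("a", "c"), ("b", "c"), ("a", "c"), ("d", "e")]

# Rank candidates in descending order of score, then drop the score column.
ranked = [c[:3] for c in sorted(scored, key=lambda c: c[3], reverse=True)]
ranked_curve = benefit_curve(nodes, base_edges, ranked, od_pairs)
control = control_median(nodes, base_edges, ranked, od_pairs)
```

In this toy case, adding the higher-scoring link first captures three of the four possible new trips with half of the added length, which gives the ranked curve its concave shape.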

Fig. 4: Marginal benefit curves of the pedestrian network improvement schemes based on the three main pednet variants as one increases the percent maximum added length of sidewalks and/or crosswalks on top of the base pednet.

Top row: Percent maximum possible increase in trip count. Bottom row: Percent maximum possible decrease in mean trip length. Maximum possible changes are shown in the legend. \({{\Delta }}\hat{L}\): Added walkway length, \({{\Delta }}\hat{N}\): Added trips, \({{\Delta }}\hat{D}\): Decrease in mean trip length.

Increase in walking

The top row in Fig. 4 shows the percent increase in the number of trips made possible after each marginal addition of walkways, going up to the maximum possible increases shown in Fig. 2a. The slope of the curves represents the marginal improvement of walking. All of the curves in Austin are clearly concave, highlighting the fact that the marginal improvement continuously decreases as lower-scoring walkways are added. This is in stark contrast to the curves of the control group, which are generally convex and lie far below the 45-degree line. For example, the addition of top-scoring sidewalk segments adding up to 20% of the maximum possible additional sidewalk length (\({{\Delta }}\hat{L}\) = 3543 mi) leads to an improvement of ≈96% of maximum improvement of trip count (i.e., 96% of the difference between trip count in A1 and A0, ≈193,000 trips). In the case of randomly selected links, the corresponding median increase is just 5%. A similar but weaker pattern is observed in Toronto’s sidewalk improvement scheme, where the proposed pednet score-based prioritization scheme yields a concave curve resulting in 57% of the maximum possible increase in trip counts for the first 20% added link length in contrast to 18% for the control group. These observations reflect the utility of pednet score as a reasonable measure of the criticality of links for pednet design.

Adding crosswalks in descending order of their pednet scores is not as reassuring as adding sidewalks, though it still seems promising. While Austin shows a clear example of the potential of crosswalk addition using pednet score as a criticality measure to improve walking, the cases of Toronto and Cambridge are mixed. Though the marginal benefit curves for crosswalks in these two cities are generally concave, some crosswalk segments are not very useful, particularly some top-scoring segments in the beginning. However, it should be noted that there is limited scope for crosswalk addition in these cities in the first place, with only 8 mi in Cambridge and 47 mi in Toronto, in contrast to 399 mi in Austin. This is also reflected in the curves of the control group, which show substantial interquartile ranges of the samples, especially in Cambridge and Toronto. Once again, this reflects the high existing connectivity of the pednets of these two cities, unlike Austin’s.

Using a mixed approach in adding sidewalks and crosswalks also shows considerable benefit. For instance, adding just 25% of the maximum total possible length of sidewalks and crosswalks combined in Austin (L = 3974 mi) yields an improvement of ≈74% of the total possible increase in trip count (i.e., 74% of T = 639,000 trips). Though this marginal benefit (74%) is lower than that of sidewalks alone (96%) or crosswalks alone (80%), the absolute benefit is still higher, since the maximum possible increase is much higher in the case of A3 than in both A1 and A2. This observation also aligns with the distribution of pednet scores in Fig. 2d, where the distribution for A3 is more skewed left than in A1 and A2, meaning that more walkway segments are needed to achieve a substantial percentage of maximum possible improvement in the trip count. The benefit pattern observed in Austin, however, is negligible in the case of Toronto, where both the costs and benefits of sidewalks far outweigh those of crosswalks.

Distance savings

In addition to the improvement in walking as measured by an increased count of trips made possible, we also report the benefits in trip length savings for the pedestrians who use the current pednet. To measure this, only the OD pairs found connected in the base pednet are tracked over multiple pednet scenarios. In each sequential scenario, a marginal increase in the number of walkways in the pednet leads to a possible shortening of the shortest paths, with the greatest reduction observed in the maximum treatment scenarios (e.g., A2, C1, T3). The bottom row of Fig. 4 shows the marginal increase in the mean trip distance savings with the addition of walkways, again represented on a 0–100% scale.

The marginal benefit curves of distance are more sharply concave than those of trip count (top row). For an average pedestrian, the addition of just 10% of the maximum additional length of sidewalks in Austin reduces the trip distance by 14 m (92% of \({{\Delta }}\hat{D}_{\text{sidewalks}}\) = 15 m, shown in dashed lines). However, the same percentage of addition of crosswalks reduces it by 238 m (84% of \({{\Delta }}\hat{D}_{\text{crosswalks}}\) = 283 m). These improvements correspond to adding 354 mi of sidewalks and just 40 mi of crosswalks. If design costs are measured only by walkway length, it implies that adding crosswalks is substantially more beneficial for reducing trip lengths compared to adding sidewalks. Note that this saving pertains to only the pedestrians who are able to reach their destination even in the base pednet rather than all the pedestrians.
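The distance-saving metric can be illustrated with a small example that mirrors the situation in Fig. 1. The sketch below is a simplified illustration with a hypothetical network, hypothetical lengths in meters, and function names of our choosing. It tracks only the OD pairs connected in the base network and averages the reduction in their shortest-path length after candidate links are added.

```python
import heapq
from collections import defaultdict

INF = float("inf")


def shortest_distance(edges, source, target):
    """Dijkstra shortest-path length on an undirected graph, or None if unreachable."""
    adj = defaultdict(list)
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        if u == target:
            return d
        for v, length in adj[u]:
            nd = d + length
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None


def mean_distance_saving(base_edges, improved_edges, od_pairs):
    """Mean shortest-path reduction over OD pairs connected in the base network."""
    savings = []
    for origin, dest in od_pairs:
        before = shortest_distance(base_edges, origin, dest)
        if before is None:
            continue  # newly connected pairs count as added trips, not savings
        after = shortest_distance(improved_edges, origin, dest)
        savings.append(before - after)
    return sum(savings) / len(savings) if savings else 0.0


# Hypothetical toy network with lengths in meters.
base = [("O", "X", 100.0), ("X", "D2", 100.0)]
candidates = [("O", "D2", 120.0), ("X", "D1", 50.0)]
od_pairs = [("O", "D2"), ("O", "D1")]  # D1 is unreachable in the base network

saving = mean_distance_saving(base, base + candidates, od_pairs)
```

Here the pair (O, D1) is excluded because it is not connected in the base network, so the mean saving reflects only the shortened trip from O to D2.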

The curves of Toronto and Cambridge are also promising, though the maximum possible benefit in these cities is substantially smaller than that in Austin. For reference, the maximum mean savings by adding crosswalks are 10 m in Cambridge and 2 m in Toronto, in sharp contrast to Austin’s 283 m. This is also a result of the already high connectivity of these cities’ existing pednets in contrast to Austin’s.

In summary, the marginal benefit curves of all the improvement schemes in all the cities are largely concave, indicating the utility of pednet score as a reasonable measure of the criticality of pednet links (sidewalks and crosswalks) in improving walking and reducing trip lengths.

Visualization of critical links

The concavity of the marginal benefit curves in Fig. 4 is beneficial from a network design point of view. Under budget constraints on the construction or improvement of walkways, it is useful to know that the top few candidate segments selected based on their pednet score yield the maximum benefit to the system by increasing the extent of walking (measured by the number of new trips made possible) and reducing trip length for pedestrians under existing demand.

For the case study of Austin, Fig. 5a shows the top 2000 sidewalk and crosswalk segments whose inclusion leads to an increase in the trip count by 48% and 37% of the maximum possible increase respectively. For reference, the original pednet (A0) segments are shown in light gray. It can be seen that the proposed crosswalk segments are concentrated in the core of the city, as suggested in panel A2 of Fig. 3. Downtown Austin already has appropriate sidewalk and crosswalk infrastructure, though the nearby regions of Central East Austin and East Cesar Chavez can greatly benefit from adding crosswalks (Fig. 5c). Sidewalk segments, on the other hand, are more scattered throughout the city, with a particularly large cluster in the northwestern suburbs of Austin around US-183 such as Great Hills Neighborhood and Leffler Commercial (Fig. 5b). A similar pattern is observed in Toronto and Cambridge where many candidate crosswalks near the downtown region are classified as critical, though the downtown core itself is highly connected (refer to Supplementary Fig. 4). Although the pattern is common at an aggregate level, a detailed investigation of the reasons and specifications of this observation is left for a future study.

Fig. 5: Visualization of the critical links identified using pednet score.

a Map of 2000 proposed sidewalks and crosswalks prioritized for design in Austin based on their pednet scores. Insets show close-ups of b Northwest Austin and c the central core of Austin, which have substantial scope for improvement. Neighborhood planning area (NPA) boundaries are shown in green. Maps produced using Python.

Discussion

In this study, we introduce a network-based composite measure of walkability and walking called ‘pednet score’ that is derived from and is useful for the construction/augmentation of a city’s pedestrian network (pednet). This composite measure is defined for a pednet edge (walkway) and is measured using simulated or real link-level pedestrian traffic. The proposed pednet score is a useful tool in measuring not just connectivity-based walkability, but also the extent of walking facilitated. It is observed to correspond to link criticality where high pednet scoring edges are more critical to the network in influencing the amount of walking using them.

For different pednet variants, including the potential candidate sidewalk and crosswalk segments, their network properties and pednet score distribution provide some interesting insights and design implications concerning the value of the pednet-based walkability assessment. Marginal benefit curves developed for each variant across the three study cities (Austin, Cambridge, and Toronto) provide the following key observations:

  • With the addition of ≈34,000 sidewalk segments in Austin totaling ≈3,300 mi, the number of pedestrian trips strictly using the pednet in Austin can increase by ≈73%. Interestingly, segments selected strategically using the proposed pednet score totaling just a quarter of this total length can achieve ≈98% of this increase in trip count. In Toronto, the scope of improvement is smaller but is still substantial (≈24% with 173 mi of sidewalks).

  • In Austin, adding all candidate crosswalks to the pednet is more beneficial than adding sidewalks. The addition of ≈36,000 crosswalk segments totaling ≈400 mi can increase trip count by ≈95% while that of just the first 100 mi of high pednet score-ranking crosswalks can achieve 89% of this increase. In addition to increasing walking, this development in connectivity can also reduce the mean pedestrian trip distance of current pedestrians by 283 m (0.18 mi). The crosswalk connectivity of Toronto and Cambridge is sufficiently high to not warrant substantial investment in crosswalk improvement to promote walking.

  • A mixed-development approach where the top-scoring sidewalk and crosswalk segments are prioritized for pednet improvement can lead to a much greater benefit in some cases than developing sidewalks and crosswalks alone. This is evident in Austin where the addition of 3974 mi of segments can increase trip count by ≈236%, in contrast to 3300 mi of only sidewalks (73%) and 400 mi of only crosswalks (95%). Similarly, the mean trip distance of current pedestrians can decrease by 340 m (0.21 mi), compared with 283 m for just crosswalks.

  • The proposed measure can distinguish between the nature of sidewalks and crosswalks. For example, in suburban regions in Austin, just a few crosswalks can connect long segments of sidewalks. The pednet scores across the pednet scenarios show the extent to which this phenomenon occurs.

Since the pednet score is a link-level walkway design metric involving hypothetical/design scenarios, it can be useful in identifying specific areas of interest or areas with substantial scope for improvement. In Austin, for example, the map of top-scoring 2000 proposed sidewalks (Fig. 5) detects not only the regions with higher potential for improvement near the urban core but also the suburban Great Hills Neighborhood, which has a high potential for sidewalk improvement. When this metric is computed at a bigger scale, such as a city’s metropolitan statistical area, and combined with other layers of analysis, such as access to POIs and socioeconomic factors of neighborhoods, it can identify regions and segments with even more refined assessment criteria. This includes, for example, suburban/rural locations where accessibility by walking to important places like schools and grocery stores is poor, or underserved regions in dense urban cores. In general, this metric can be used either as a standalone measure of walkability and link criticality, such as in an online dashboard like the University of Minnesota’s Accessibility Observatory42, or in conjunction with other existing tools of walkability analysis (e.g., in43). This tool could greatly help agencies identify future pedestrian infrastructure needs and support the development of budgets or capital improvement plans.

This study contributes to the literature on walkability and pedestrian network design. However, it is worth noting that it only concerns the infrastructural connectivity aspect of walkability. The consideration of other aspects of walkability, such as safety, attractiveness, comfort, and accessibility to POIs and public transit nodes, would be a welcome addition to this study. Through the notion of a pedestrian network, gross assumptions about pedestrian movement, such as measuring accessibility using the ‘as-the-crow-flies’ distance, are avoided. However, the main assumption in this study, that pedestrians only move along sidewalks and intersection crosswalks, should be treated with caution, since other facilities and activities such as trails, tunnels, and jaywalking may be influential in some cases. Future studies could relax this assumption and validate the results of pednet score-based network design analysis. Moreover, we are currently working on validating the simulated pedestrian demand used in this study against real-world OD demand matrices, such as those from surveys and microscopic mobility data, to improve the analysis, especially since the choice of demand distribution plays a significant role in determining critical walkways. Finally, the cost and benefit metrics presented in this study may be extended to account for other key factors such as the differences in per-mile costs of sidewalks and crosswalks, actual demand shift from automobiles to walking, and/or emissions reduction instead of just gross trip count.

Methods

Data preparation

The overview of the data processing for this study is shown in Supplementary Figure 1. We use publicly available geographic data of the three cities from city data portals (Austin:44, Cambridge:45, Toronto:46) as well as other sources such as OpenStreetMap (OSM), the US TIGER/LINE database, and a commercial data vendor SafeGraph Inc. that provided POI data in addition to OSM. These are used to generate the base pedestrian network (‘pednet’) of the study cities. Useful attributes of the sidewalk layer table are used to label sidewalk segments as ‘existing’ or ‘candidate’ (non-existent), though this information is not available in Cambridge (see Supplementary Section 2.2 for more details).

The crosswalk layers, on the other hand, do not contain data on non-existent segments. A pednet with only a few existing crosswalks connecting sidewalks is bound to result in a highly disconnected network, which is unrealistic since pedestrians often jaywalk (i.e., cross streets at unmarked crossings) at specific points if they find adequate gaps in traffic but no adequate crossing infrastructure47. To overcome this limitation, we create a set of candidate crosswalk segments at road intersections using GIS tools (as explained in Supplementary Section 2.3). This process not only increases pednet connectivity but also allows us to assess which currently non-existent crossings are utilized more.

Once the base pednet is created, we use the four combinations of the two walkway segment types (sidewalk and crosswalk) and the existence binary to identify allowed segments and generate four pednet variants (except in Cambridge). Then, we combine the data of buildings, POI, land use, and population distribution (from the 5-year American Community Survey of 202048). We then select 1,000,000 OD pairs within 800 m of each other to simulate walking trips in each study city, which is consistent with typical walking distances of 400–1000 m49,50. More details are presented in Supplementary Section 3.4. A trip is labeled ‘possible’ if the pednet nodes nearest to the trip’s origin and destination buildings/POIs are connected in a given pednet variant, resulting in an all-or-nothing assignment. For possible trips, distances are computed along the pednet using Dijkstra’s shortest path algorithm. The weight of each pednet edge is computed based on its popularity, given by the number of trip paths containing it. Pednet scores are then computed as described below and used to identify the critical links in each city.
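
The trip assignment and edge-popularity steps described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the toy network, the node names, the edge lengths, and the helper functions `shortest_path` and `edge_popularity` are all hypothetical, and trip endpoints are assumed to have already been snapped to their nearest pednet nodes.

```python
import heapq
from collections import Counter


def shortest_path(adj, source, target):
    """Dijkstra's algorithm. Returns (distance, path), or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, length in adj.get(u, {}).items():
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in done:
        return float("inf"), []
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


def edge_popularity(adj, od_pairs):
    """Count, for each undirected edge, the number of possible trips routed over it."""
    counts = Counter()
    n_possible = 0
    for origin, destination in od_pairs:
        _, path = shortest_path(adj, origin, destination)
        if not path:
            continue  # trip is not 'possible' in this pednet variant
        n_possible += 1
        for u, v in zip(path, path[1:]):
            counts[frozenset((u, v))] += 1
    return counts, n_possible


# Hypothetical pednet variant: A-B-C is connected, D-E is a separate component.
# Edge lengths are in meters.
adj = {
    "A": {"B": 100.0},
    "B": {"A": 100.0, "C": 150.0},
    "C": {"B": 150.0},
    "D": {"E": 80.0},
    "E": {"D": 80.0},
}
od_pairs = [("A", "C"), ("B", "C"), ("A", "D")]  # the last trip is not possible
weights, possible = edge_popularity(adj, od_pairs)
```

In this toy case two of the three trips are possible, and the edge between B and C carries both of them, so it receives the larger weight.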

Pednet score

The pednet score proposed in this study is a composite of a connectivity measure and a walkway utility measure. This means that for two edges that are equally important in terms of connectivity, the one that caters to more pedestrian demand is deemed more critical.

Utility in the context of pednets is measured by edge popularity, i.e., the edge weights obtained through trip simulation. The contribution of an edge to the pednet’s connectivity is measured using a betweenness centrality measure. Betweenness is a common centrality measure that is used to rank vertices and edges by their importance to network connectivity51,52. We use a variant called the distance-adjusted edge betweenness (dEB) measure. For each edge, it is given by the sum, over all OD pairs in the network, of the proportion of shortest paths passing through it, with each OD pair weighted by the reciprocal of its shortest path length53. Inverse distance scaling favors shorter paths. This is desirable because distance plays a much more important role in trip-making behavior for walking trips than for other travel modes54.

For an undirected graph G(V, E), dEB is given by the following expression:

$${{{\rm{dEB}}}}(e)=\mathop{\sum}\limits_{u\ne v\in V}\frac{1}{d(u,v)}\cdot \frac{| {{{\mathscr{P}}}}(u,v| e)| }{| {{{\mathscr{P}}}}(u,v)| }\ \ \forall \ e\in E$$
(1)

Here, \({{{\mathscr{P}}}}(u,v| e)\) is the set of shortest paths between vertices u and v that pass through e, and \({{{\mathscr{P}}}}(u,v)\) is the set of all shortest paths between u and v, with their common length being d(u, v). By convention, 0/0 = 0. The pednet score of an edge e is defined as the product of its edge weight, w(e), and its dEB value, taken on a logarithmic scale to reduce the range of the resulting values:

$$\gamma (e):= {\log }_{10}[w(e)\cdot {{{\rm{dEB}}}}(e)]$$
(2)
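
As an illustration of Eqs. (1) and (2), the following Python sketch computes dEB and the pednet score on a toy graph by brute force. It is a sketch under stated assumptions rather than the study's code: the function names, the four-node graph, and the edge weights are hypothetical; the sum in Eq. (1) is taken over unordered vertex pairs (summing over ordered pairs would double every value); edge lengths are assumed positive with no self-loops; and edges with a zero product are assigned a score of negative infinity, a case the text does not specify.

```python
import heapq
import math
from itertools import combinations


def dijkstra_counts(adj, source):
    """Return (dist, sigma): shortest distances from source and shortest-path counts."""
    dist = {source: 0.0}
    sigma = {source: 1}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, length in adj[u].items():
            if v in done:
                continue
            nd = d + length
            if v in dist and math.isclose(nd, dist[v]):
                sigma[v] += sigma[u]  # another equally short route to v
            elif nd < dist.get(v, math.inf):
                dist[v] = nd
                sigma[v] = sigma[u]
                heapq.heappush(heap, (nd, v))
    return dist, sigma


def distance_adjusted_edge_betweenness(adj):
    """Brute-force evaluation of Eq. (1) over unordered vertex pairs (assumption)."""
    info = {s: dijkstra_counts(adj, s) for s in adj}
    edges = {frozenset((u, v)): l for u in adj for v, l in adj[u].items()}
    deb = dict.fromkeys(edges, 0.0)
    for u, v in combinations(adj, 2):
        dist_u, sigma_u = info[u]
        if v not in dist_u:
            continue  # disconnected pair contributes nothing (0/0 = 0)
        d_uv = dist_u[v]
        for e, length in edges.items():
            a, b = tuple(e)
            through = 0
            # A shortest u-v path may cross e as a->b or as b->a.
            for x, y in ((a, b), (b, a)):
                via = dist_u.get(x, math.inf) + length + info[y][0].get(v, math.inf)
                if math.isclose(via, d_uv):
                    through += sigma_u[x] * info[y][1][v]
            deb[e] += through / (d_uv * sigma_u[v])
    return deb


def pednet_score(weight, deb_value):
    """Eq. (2); a zero product maps to -inf (assumption, not stated in the text)."""
    product = weight * deb_value
    return math.log10(product) if product > 0 else -math.inf


# Hypothetical 4-cycle with unit edge lengths; opposite corners have two shortest paths.
adj = {
    "A": {"B": 1.0, "D": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0, "A": 1.0},
}
deb = distance_adjusted_edge_betweenness(adj)
weights = {e: 10 for e in deb}  # hypothetical edge popularity from trip simulation
scores = {e: pednet_score(weights[e], deb[e]) for e in deb}
```

On this symmetric graph every edge receives the same dEB: a contribution of 1 from its own endpoints plus 1/2 × 1/2 from each of the two opposite-corner pairs, for a total of 1.5. For networks of realistic size, an accumulation scheme in the style of Brandes' algorithm would be far more efficient than this pairwise enumeration.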