Ćwiczenie 1 - Wprowadzenie¶

In [1]:
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import matplotlib.colors as clt
import geodatasets
/tmp/ipykernel_10242/1167241068.py:1: DeprecationWarning: 
Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
        
  import pandas as pd
In [2]:
#get path is imported from geodatasets library
from geodatasets import get_path
path_to_data = get_path("nybb")
gdf = gpd.read_file(path_to_data)
#gdf.head()
gdf.plot(column = 'BoroName')
Out[2]:
<Axes: >
No description has been provided for this image

Operacje na danych wektorowych¶

In [3]:
#calculate areas
gdf["area"] = gdf.area
gdf["area"]
Out[3]:
0    1.623822e+09
1    3.045214e+09
2    1.937478e+09
3    6.364712e+08
4    1.186926e+09
Name: area, dtype: float64
In [4]:
cn = gdf.centroid
cn.plot()
Out[4]:
<Axes: >
No description has been provided for this image
In [5]:
gdf["bfr"] = gdf.buffer(10000)
gdf["bfr"].plot()
Out[5]:
<Axes: >
No description has been provided for this image

Ujednolicenie układu współrzędnych¶

In [6]:
pkt = gpd.read_file("data/ny.gpkg", layer = "supermarket", driver = "GPKG")
ax = gdf["geometry"].plot()
pkt.plot(ax=ax, color="black")
Out[6]:
<Axes: >
No description has been provided for this image
In [7]:
gdf.crs
Out[7]:
<Projected CRS: EPSG:2263>
Name: NAD83 / New York Long Island (ftUS)
Axis Info [cartesian]:
- X[east]: Easting (US survey foot)
- Y[north]: Northing (US survey foot)
Area of Use:
- name: United States (USA) - New York - counties of Bronx; Kings; Nassau; New York; Queens; Richmond; Suffolk.
- bounds: (-74.26, 40.47, -71.8, 41.3)
Coordinate Operation:
- name: SPCS83 New York Long Island zone (US Survey feet)
- method: Lambert Conic Conformal (2SP)
Datum: North American Datum 1983
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich

W jakim układzie współrzędnych są dane punktowe (pkt)?

In [8]:
#assign to pkt crs from gdf
pkt = pkt.to_crs(gdf.crs)
In [9]:
ax = gdf["geometry"].plot()
pkt.plot(ax=ax, color="black", markersize = 0.5)
Out[9]:
<Axes: >
No description has been provided for this image

Złączenia przestrzenne¶

In [10]:
gdf
Out[10]:
BoroCode BoroName Shape_Leng Shape_Area geometry area bfr
0 5 Staten Island 330470.010332 1.623820e+09 MULTIPOLYGON (((970217.022 145643.332, 970227.... 1.623822e+09 POLYGON ((903234.894 123347.784, 903178.057 12...
1 4 Queens 896344.047763 3.045213e+09 MULTIPOLYGON (((1029606.077 156073.814, 102957... 3.045214e+09 POLYGON ((1066963.473 157602.686, 1067059.264 ...
2 3 Brooklyn 741080.523166 1.937479e+09 MULTIPOLYGON (((1021176.479 151374.797, 102100... 1.937478e+09 POLYGON ((962679.120 165570.385, 962651.330 16...
3 1 Manhattan 359299.096471 6.364715e+08 MULTIPOLYGON (((981219.056 188655.316, 980940.... 6.364712e+08 POLYGON ((980499.119 178448.735, 979864.868 17...
4 2 Bronx 464392.991824 1.186925e+09 MULTIPOLYGON (((1012821.806 229228.265, 101278... 1.186926e+09 POLYGON ((992724.911 240962.362, 992700.941 24...
In [11]:
pkt.head()
Out[11]:
full_id brand name geometry
0 n419360013 None Waverly Urban Market POINT (993552.584 189584.172)
1 n445184616 Stop & Shop Stop & Shop POINT (1196535.721 368631.849)
2 n502791662 None Yong Fa POINT (1032646.155 210176.107)
3 n566894297 None None POINT (1022173.126 240706.578)
4 n568230825 None Aron's Kissena Farms POINT (1035770.502 204714.347)
In [12]:
sel = gdf[["BoroCode", "BoroName", "geometry"]]
pkt_with_ny = pkt.sjoin(sel, how="inner", predicate='intersects')
pkt_with_ny.head()
Out[12]:
full_id brand name geometry index_right BoroCode BoroName
0 n419360013 None Waverly Urban Market POINT (993552.584 189584.172) 2 3 Brooklyn
2 n502791662 None Yong Fa POINT (1032646.155 210176.107) 1 4 Queens
3 n566894297 None None POINT (1022173.126 240706.578) 4 2 Bronx
4 n568230825 None Aron's Kissena Farms POINT (1035770.502 204714.347) 1 4 Queens
5 n584125647 Whole Foods Market Whole Foods Market POINT (986644.761 206981.429) 3 1 Manhattan

Punkty w poligonie¶

In [13]:
supermakets_in_boro = pkt_with_ny.groupby('BoroCode').count()
supermakets_in_boro 
Out[13]:
full_id brand name geometry index_right BoroName
BoroCode
1 223 68 216 223 223 223
2 65 21 55 65 65 65
3 231 47 220 231 231 231
4 259 65 248 259 259 259
5 1 0 1 1 1 1
In [14]:
#złączenie przestrzenne oraz obliczenie punktów w poligonie
sel = gdf[["BoroCode", "BoroName", "geometry"]]
nb_pkt = pkt.sjoin(sel, how="inner", predicate='intersects').groupby('BoroCode').count()
nb_pkt["counts"] = nb_pkt["full_id"]
nb_pkt
Out[14]:
full_id brand name geometry index_right BoroName counts
BoroCode
1 223 68 216 223 223 223 223
2 65 21 55 65 65 65 65
3 231 47 220 231 231 231 231
4 259 65 248 259 259 259 259
5 1 0 1 1 1 1 1

Złączenie na podstawie atrybutów¶

In [15]:
gdf2 = gdf
gdf2.set_index("BoroCode", drop=True)
Out[15]:
BoroName Shape_Leng Shape_Area geometry area bfr
BoroCode
5 Staten Island 330470.010332 1.623820e+09 MULTIPOLYGON (((970217.022 145643.332, 970227.... 1.623822e+09 POLYGON ((903234.894 123347.784, 903178.057 12...
4 Queens 896344.047763 3.045213e+09 MULTIPOLYGON (((1029606.077 156073.814, 102957... 3.045214e+09 POLYGON ((1066963.473 157602.686, 1067059.264 ...
3 Brooklyn 741080.523166 1.937479e+09 MULTIPOLYGON (((1021176.479 151374.797, 102100... 1.937478e+09 POLYGON ((962679.120 165570.385, 962651.330 16...
1 Manhattan 359299.096471 6.364715e+08 MULTIPOLYGON (((981219.056 188655.316, 980940.... 6.364712e+08 POLYGON ((980499.119 178448.735, 979864.868 17...
2 Bronx 464392.991824 1.186925e+09 MULTIPOLYGON (((1012821.806 229228.265, 101278... 1.186926e+09 POLYGON ((992724.911 240962.362, 992700.941 24...
In [16]:
gdf2_attr = gdf2.join(nb_pkt["counts"])
gdf2_attr
Out[16]:
BoroCode BoroName Shape_Leng Shape_Area geometry area bfr counts
0 5 Staten Island 330470.010332 1.623820e+09 MULTIPOLYGON (((970217.022 145643.332, 970227.... 1.623822e+09 POLYGON ((903234.894 123347.784, 903178.057 12... NaN
1 4 Queens 896344.047763 3.045213e+09 MULTIPOLYGON (((1029606.077 156073.814, 102957... 3.045214e+09 POLYGON ((1066963.473 157602.686, 1067059.264 ... 223.0
2 3 Brooklyn 741080.523166 1.937479e+09 MULTIPOLYGON (((1021176.479 151374.797, 102100... 1.937478e+09 POLYGON ((962679.120 165570.385, 962651.330 16... 65.0
3 1 Manhattan 359299.096471 6.364715e+08 MULTIPOLYGON (((981219.056 188655.316, 980940.... 6.364712e+08 POLYGON ((980499.119 178448.735, 979864.868 17... 231.0
4 2 Bronx 464392.991824 1.186925e+09 MULTIPOLYGON (((1012821.806 229228.265, 101278... 1.186926e+09 POLYGON ((992724.911 240962.362, 992700.941 24... 259.0

Typ geometrii¶

Zbiór danych może zawierać kilka kolumn definiujących geometrię obiektów. Funkcja set_geometry ustawia, która kolumna będzie definiować geometrię.

In [17]:
#do gdf dodam kolumne w ktorej wylicze centroidy
gdf['centroid'] = gdf.centroid
gdf['boundary'] = gdf.boundary
gdf
Out[17]:
BoroCode BoroName Shape_Leng Shape_Area geometry area bfr centroid boundary
0 5 Staten Island 330470.010332 1.623820e+09 MULTIPOLYGON (((970217.022 145643.332, 970227.... 1.623822e+09 POLYGON ((903234.894 123347.784, 903178.057 12... POINT (941639.450 150931.991) MULTILINESTRING ((970217.022 145643.332, 97022...
1 4 Queens 896344.047763 3.045213e+09 MULTIPOLYGON (((1029606.077 156073.814, 102957... 3.045214e+09 POLYGON ((1066963.473 157602.686, 1067059.264 ... POINT (1034578.078 197116.604) MULTILINESTRING ((1029606.077 156073.814, 1029...
2 3 Brooklyn 741080.523166 1.937479e+09 MULTIPOLYGON (((1021176.479 151374.797, 102100... 1.937478e+09 POLYGON ((962679.120 165570.385, 962651.330 16... POINT (998769.115 174169.761) MULTILINESTRING ((1021176.479 151374.797, 1021...
3 1 Manhattan 359299.096471 6.364715e+08 MULTIPOLYGON (((981219.056 188655.316, 980940.... 6.364712e+08 POLYGON ((980499.119 178448.735, 979864.868 17... POINT (993336.965 222451.437) MULTILINESTRING ((981219.056 188655.316, 98094...
4 2 Bronx 464392.991824 1.186925e+09 MULTIPOLYGON (((1012821.806 229228.265, 101278... 1.186926e+09 POLYGON ((992724.911 240962.362, 992700.941 24... POINT (1021174.790 249937.980) MULTILINESTRING ((1012821.806 229228.265, 1012...
In [18]:
gdf.plot()
Out[18]:
<Axes: >
No description has been provided for this image
In [19]:
#zmieniam geometrie na punkty 
gdf = gdf.set_geometry("centroid")
gdf.plot()
Out[19]:
<Axes: >
No description has been provided for this image
In [20]:
gdf["bfr_centroid"] = gdf.buffer(5000)
gdf["bfr_centroid"].plot(color = "green")                           
Out[20]:
<Axes: >
No description has been provided for this image
In [21]:
gdf = gdf.set_geometry("boundary")
gdf.plot()
Out[21]:
<Axes: >
No description has been provided for this image
In [22]:
#ustawiam geometrię z powrotem na warstwę poligonową
gdf = gdf.set_geometry("geometry")
gdf.plot()
Out[22]:
<Axes: >
No description has been provided for this image

Wyświetlanie¶

In [23]:
gdf.plot(column = "area", cmap="Reds", legend = True)
Out[23]:
<Axes: >
No description has been provided for this image

Wyświetlanie kilku warstw¶

In [24]:
#ax = ax okresla co bedzie wyswietlone w podkladzie, wyswietli poligony a na nie nalozy punkty
ax = gdf["geometry"].plot()
gdf["centroid"].plot(ax = ax, color = "black")
Out[24]:
<Axes: >
No description has been provided for this image
In [25]:
ax = gdf["bfr"].plot(color = "grey")
gdf["boundary"].plot(ax=ax, color="white", linewidth = 0.3)
gdf["bfr_centroid"].plot(ax=ax, color="green")
gdf["centroid"].plot(ax=ax, color="red")
Out[25]:
<Axes: >
No description has been provided for this image

Kolejność wyświetlania warstw¶

Przy wyświetlaniu wielu warstw warto użyć parametru zorder, który wskazuje kolejność wyświetlania warstw. W poniższym przykładzie przypisz zorder=3 do bfr_centroid oraz zorder=4 do zmiennej centroid.

In [26]:
ax = gdf["bfr"].plot(color = "grey")
gdf["boundary"].plot(ax=ax, color="white", linewidth = 0.3, zorder=2)
gdf["bfr_centroid"].plot(ax=ax, color="green", zorder=3)
gdf["centroid"].plot(ax=ax, color="red", zorder=4)
Out[26]:
<Axes: >
No description has been provided for this image

Zapisywanie do pliku¶

In [31]:
ax = gdf["bfr"].plot(color = "grey")
gdf["boundary"].plot(ax=ax, color="white", linewidth = 0.3, zorder=2)
gdf["bfr_centroid"].plot(ax=ax, color="green", zorder=3)
gdf["centroid"].plot(ax=ax, color="red", zorder=4)
plt.savefig('fig1.png', dpi=300)
No description has been provided for this image

Mapy interaktywne¶

  • https://geopandas.org/en/stable/docs/user_guide/interactive_mapping.html
In [27]:
## Mapy interaktywne
gdf.explore()
Out[27]:
Make this Notebook Trusted to load map: File -> Trust Notebook
In [28]:
gdf.explore(
    column="BoroName",  # make choropleth based on "BoroName" column
    tooltip="BoroName",  # show "BoroName" value in tooltip (on hover)
    popup=True,  # show all values in popup (on click)
    tiles="CartoDB positron",  # use "CartoDB positron" tiles
    cmap="Set1",  # use "Set1" matplotlib colormap
    style_kwds=dict(color="black"),  # use black outline
)
Out[28]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Zadanie samodzielne¶

  • Z warstwy ny.gpkg wczytaj warstwę punktową theatre.
  • Ujednolić układ współrzędnych między wartstą theatre a warstwą poligonową gdf (granice dzielnic w NYC).
  • Połącz obie warstwy
  • W której dzielnicy znajduje się najwięcej, a w której najmniej teatrów?
  • Przedstaw liczbę teatrów w poszczególnych dzielnicach na mapie. Upewnij się, że wszystkie dzielnice są wyświetlone na mapie.