Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hello,
I am pulling the data from an API the code I am using is below. The API has more that 3 million rows and its is consuiming lot of time. I just want to pull the initial 10000 rows for each month. My actual query is bwlow which is pulling 400K rows. I just want to pull 10000 rows for each month. Kindly help me change the query.
let
Source = Json.Document(Web.Contents("https://data.cityofnewyork.us/resource/wn58-xk44.json?$limit=400000&$offset=0")),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"descriptor", "incident_zip", "road_ramp", "created_date", "location", "city", "cross_street_2", "bridge_highway_name", "closed_date", "park_facility_name", "intersection_street_1", "landmark", "agency_name", "latitude", "x_coordinate_state_plane", "bridge_highway_direction", "resolution_description", "taxi_pick_up_location", "incident_address", "location_type", "cross_street_1", "community_board", "agency", "park_borough", "borough", "street_name", "complaint_type", "longitude", "bbl", "status", "unique_key", "y_coordinate_state_plane", "resolution_action_updated_date", "address_type", "facility_type", "intersection_street_2", "bridge_highway_segment", "open_data_channel_type"}, {"descriptor", "incident_zip", "road_ramp", "created_date", "location", "city", "cross_street_2", "bridge_highway_name", "closed_date", "park_facility_name", "intersection_street_1", "landmark", "agency_name", "latitude", "x_coordinate_state_plane", "bridge_highway_direction", "resolution_description", "taxi_pick_up_location", "incident_address", "location_type", "cross_street_1", "community_board", "agency", "park_borough", "borough", "street_name", "complaint_type", "longitude", "bbl", "status", "unique_key", "y_coordinate_state_plane", "resolution_action_updated_date", "address_type", "facility_type", "intersection_street_2", "bridge_highway_segment", "open_data_channel_type"}),
#"Expanded location" = Table.ExpandRecordColumn(#"Expanded Column1", "location", {"latitude", "human_address", "needs_recoding", "longitude"}, {"location.latitude", "location.human_address", "location.needs_recoding", "location.longitude"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded location",{{"descriptor", type text}, {"incident_zip", Int64.Type}, {"road_ramp", type any}, {"created_date", type datetime}, {"location.latitude", type number}, {"location.human_address", type text}, {"location.needs_recoding", type logical}, {"location.longitude", type number}, {"city", type text}, {"cross_street_2", type text}, {"bridge_highway_name", type any}, {"closed_date", type datetime}, {"park_facility_name", type text}, {"intersection_street_1", type text}, {"landmark", type text}, {"agency_name", type text}, {"latitude", type number}, {"x_coordinate_state_plane", Int64.Type}, {"bridge_highway_direction", type any}, {"resolution_description", type text}, {"taxi_pick_up_location", type any}, {"incident_address", type text}, {"location_type", type text}, {"cross_street_1", type text}, {"community_board", type text}, {"agency", type text}, {"park_borough", type text}, {"borough", type text}, {"street_name", type text}, {"complaint_type", type text}, {"longitude", type number}, {"bbl", Int64.Type}, {"status", type text}, {"unique_key", Int64.Type}, {"y_coordinate_state_plane", Int64.Type}, {"resolution_action_updated_date", type datetime}, {"address_type", type text}, {"facility_type", type text}, {"intersection_street_2", type text}, {"bridge_highway_segment", type any}, {"open_data_channel_type", type text}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "created_date", "created_date - Copy"),
#"Changed Type1" = Table.TransformColumnTypes(#"Duplicated Column",{{"incident_zip", type text}, {"road_ramp", type text}, {"bridge_highway_name", type text}, {"bridge_highway_direction", type text}, {"taxi_pick_up_location", type text}, {"unique_key", type text}, {"created_date", type date}, {"created_date - Copy", type time}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "Hours", each Time.Hour([#"created_date - Copy"])),
#"Changed Type2" = Table.TransformColumnTypes(#"Added Custom",{{"Hours", Int64.Type}})
in
#"Changed Type2"
Your help would be highly appreciated.
Thank you
To pull only the initial 10000 rows for each month, you can modify the API request by using the $where parameter to filter the data based on the created_date field. Here's an example query that fetches the first 10,000 rows for each month:
https://data.cityofnewyork.us/resource/wn58-xk44.json?$where=extract(month from created_date) <= 10 AND row_number <= 10000
This query fetches only the rows where the created_date month is less than or equal to October (i.e., the first 10 months of the year) and where the row number is less than or equal to 10,000.
You can modify the $where clause to fetch data for different months or change the row limit as needed.
In Power Query, you can modify the Web.Contents function to include the $where parameter like this:
let url = "https://data.cityofnewyork.us/resource/wn58-xk44.json?$where=extract(month from created_date) <= 10 AND row_number <= 10000", Source = Json.Document(Web.Contents(url)), #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error), ... in #"Changed Type2"
Replace the url variable with the modified API query. This should limit the data to the first 10,000 rows for each month.
Best regards,
Isaac Chavarria
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Check out the September 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
72 | |
67 | |
42 | |
28 | |
20 |