Category: OCI

  • Using Python Pandas to turn ISO Country Codes into a string to use as values for a SQL Query

    Using Python Pandas to turn ISO Country Codes into a string to use as values for a SQL Query

    Summary, code, resources

    Problem

While querying a table (based on this dataset) with SQL, you realize one of your columns uses 3-character ISO Country Codes. However, some of these 3-character codes aren’t actual countries; they represent geographical regions or groups of countries. How can you filter the rows so you are left with the countries only?

    Answer

    Use the Python Pandas library to scrape ISO country codes and convert the values to one single string. Then use that string as values for a subsequent SQL query (possibly something like this):

    SELECT * FROM [your_table]
    WHERE country_code IN ([values from the generated list-as-string separated by commas and encased by single / double quotes]);

    Code

# Libraries used in this code (bs4, requests, and csv aren't called directly
# below, but pandas' read_html does need an HTML parser such as lxml or
# BeautifulSoup installed)
from bs4 import BeautifulSoup
import requests
import csv
import pandas as pd
    
    # I found these ISO country codes on the below URL. Pandas makes it easy to read HTML and manipulate it. Very cool!
    iso_codes = pd.read_html("https://www.iban.com/country-codes")
    
# read_html returns a list of data frames; the table I want is the first one (index 0).
    df = iso_codes[0]
    
# But really, all I care about is the 3-character country code. So I'll make that the df (dataframe) and strip out the index
    df = df['Alpha-3 code'].to_string(index=False)
    
    # From here, I'll save this little guy as a text file.
    with open("./countries.txt", "w") as f:
        f.write(df)
    
    # I'll set up a list. *** This was my approach, but if you find a better way, feel free to comment or adjust. ***
    my_list = []
    
# Then I'll open that text file and read it in.
with open("./countries.txt", "r") as file:
    countries = file.read()
    
    # I need to remove the "new line" identifiers, so I'm doing that here. 
    my_list = countries.split('\n')
    
    # Once I do that, I can create two new strings. I do this with f-Strings. Great article on using them here: https://realpython.com/python-f-strings/ 
    
    # I have two options here: one where the codes are contained by single quotes, the other with double quotes. Oracle Autonomous Database likes single quotes, but your DB may differ.
    
    countries_string_single_quotes = ','.join(f"'{x}'" for x in my_list)
    
    countries_string_double_quotes = ','.join(f'"{x}"' for x in my_list)
    
    # From here, I take those strings and save them in a text file. You don't have to do this; you can print and copy/paste the string. But this might be an excellent addition if you want to refer to these later without running all the code. 
    
    with open("./countries_as_list_single_quotes.txt", "a") as f:
        f.write(countries_string_single_quotes)
    
    with open("./countries_as_list_double_quotes.txt", "a") as f:
        f.write(countries_string_double_quotes)

    GitHub repo details

    You can find the code from this post in my GitHub repository. The repository consists of the following:

    • The Python code I created for solving this problem
    • A countries.txt file, which is produced midway through the code (temporary placeholder for later processing)
    • ‘Single quotes’ .txt file – the 3-character ISO Country Codes are formatted as a string. The values are enclosed by single quotes; commas throughout
    • “Double quotes” .txt file – the 3-character ISO Country Codes are formatted as a string. The values are enclosed by double quotes; commas throughout

    Resources


    The in-depth walkthrough

    Backstory

    A few days ago, I stumbled upon this question in the r/datasets subreddit:

    I spent most of the morning figuring out how I would go about this, and after some trial and error, I devised a plan. I decided to take the list of ISO Country Codes (which I found here) and use them as values for filtering in a SQL statement (later on in Oracle SQL Developer Web).

    After some research, I figured out the proper SQL syntax for a successful query.

    SELECT * FROM [your_table]
    WHERE country_code IN ([values from the generated list-as-string separated by commas and encased by single / double quotes]);

    From there, I knew I needed to work backward on those ISO Country Codes. Meaning I needed to take something that looked like this:

    And turn it into something more workable. It turns out that grabbing this was pretty straightforward. I’m using Pandas primarily for this exercise, but first, I need to import some libraries:

# Libraries used in this code (bs4, requests, and csv aren't called directly
# below, but pandas' read_html does need an HTML parser such as lxml or
# BeautifulSoup installed)
from bs4 import BeautifulSoup
import requests
import csv
import pandas as pd

    Next, I’ll use Pandas’ read_html function (this feels like cheating, but it’s incredible) to read in the table.

    # I found these ISO country codes on the below URL. Pandas makes it easy to read HTML and manipulate it. Very cool!
    iso_codes = pd.read_html("https://www.iban.com/country-codes")
    
# read_html returns a list of data frames; the table I want is the first one (index 0).
    df = iso_codes[0]

    This is wild, but this is what the printout looks like:

The Pandas read_html() function is powerful.

If you squint, you can see an “Alpha-2 code” and an “Alpha-3 code” column in the image. From here, I need to isolate the “Alpha-3 code” column. So I reshaped the data frame into a single column and dropped the index (dropping the index is optional; you could keep it if you needed it, perhaps to create a separate table in your database).

# But really, all I care about is the 3-character country code. So I'll make that the df (dataframe) and strip out the index
    df = df['Alpha-3 code'].to_string(index=False)

    I’ll save this data frame as a .txt file.

    # From here, I'll save this little guy as a text file.
    with open("./countries.txt", "w") as f:
        f.write(df)

    This is only temporary (FYI: this is the only way I could figure out how to do this). It’ll look like this:

The temporary .txt file of 3-character ISO Country Codes.

Next, I take that temporary text file and read it in. I’m going to add its contents to a list, so I’ll first create an empty list (aptly named “my_list”). I also need to remove the newline characters; otherwise, when I create my string of values (that comes in the final step), the string will look like this:

The “countries” string with “\n” characters.

    I remove the newline characters with this piece of code:

    # I need to remove the "new line" identifiers, so I'm doing that here. 
    my_list = countries.split('\n')

The almost-finished string of values will look like this:

New line characters have now been removed.

I use f-strings to create the following two strings: countries_string_single_quotes and countries_string_double_quotes, respectively. Need to learn about f-strings (or, more formally, Literal String Interpolation)? No problemo! Check out these three resources:

The code for the f-strings is below. I loop through my_list and separate each x (the item I’m iterating over) with commas (that’s the join).

    # Once I do that, I can create two new strings. I do this with f-Strings. Great article on using them here: https://realpython.com/python-f-strings/ 
    
    # I have two options here: one where the codes are contained by single quotes, the other with double
    # quotes. Oracle Autonomous Database likes single quotes, but your DB may differ.
    
    countries_string_single_quotes = ','.join(f"'{x}'" for x in my_list)
    
    countries_string_double_quotes = ','.join(f'"{x}"' for x in my_list)

And now that I have these two objects (are they called objects??), I’ll save them each as a text file. One file has the 3-character codes surrounded by single quotes, the other with double quotes. The code:

    # From here, I take those strings and save them in a text file. You don't have to do this; you can print
    # and copy/paste the string. But this might be a nice addition if you want to refer to these later 
    # without running all the code. 
    
    with open("./countries_as_list_single_quotes.txt", "a") as f:
        f.write(countries_string_single_quotes)
    
    with open("./countries_as_list_double_quotes.txt", "a") as f:
        f.write(countries_string_double_quotes)

    The text files look like this now:

The country codes are now presented in one long string. Pretty cool, eh?
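Incidentally, if you’d rather skip the temporary text file entirely, here’s a shorter route to the same string (my sketch, not the code from this post):

import pandas as pd

# read_html returns a list of data frames; the first one is the codes table
tables = pd.read_html("https://www.iban.com/country-codes")

# grab the column as a plain Python list, then quote-and-join in one pass
codes = tables[0]["Alpha-3 code"].tolist()
countries_string_single_quotes = ",".join(f"'{code}'" for code in codes)

print(countries_string_single_quotes[:35])  # 'AFG','ALA','ALB','DZA','ASM',...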

    SQL time

    We have arrived! Let me show you what I can do now!

    I took the CSV data from the World Bank and loaded it into my Autonomous Database. Our returning intern Layla put together a video of how to do this; you can check it out here:

Once my table was created, I ran a quick SELECT [columns] FROM query. Here you can see my “beginning state”.

    There are 266 entries; some are countries, and others are not. And if you recall, the original question asked how somebody could filter out the non-countries. Onto that next!

    This is the best part. I can take the string I made and use that in a SQL query such as this:

    SELECT * from ADMIN.REDDIT_TABLE
    WHERE COUNTRY_CODE IN('AFG','ALA','ALB','DZA','ASM','AND','AGO','AIA','ATA',
    'ATG','ARG','ARM','ABW','AUS','AUT','AZE','BHS','BHR','BGD','BRB','BLR','BEL',
    'BLZ','BEN','BMU','BTN','BOL','BES','BIH','BWA','BVT','BRA','IOT','BRN','BGR',
    'BFA','BDI','CPV','KHM','CMR','CAN','CYM','CAF','TCD','CHL','CHN','CXR','CCK',
    'COL','COM','COD','COG','COK','CRI','CIV','HRV','CUB','CUW','CYP','CZE','DNK',
    'DJI','DMA','DOM','ECU','EGY','SLV','GNQ','ERI','EST','SWZ','ETH','FLK','FRO',
    'FJI','FIN','FRA','GUF','PYF','ATF','GAB','GMB','GEO','DEU','GHA','GIB','GRC',
    'GRL','GRD','GLP','GUM','GTM','GGY','GIN','GNB','GUY','HTI','HMD','VAT','HND',
    'HKG','HUN','ISL','IND','IDN','IRN','IRQ','IRL','IMN','ISR','ITA','JAM','JPN',
    'JEY','JOR','KAZ','KEN','KIR','PRK','KOR','KWT','KGZ','LAO','LVA','LBN','LSO',
    'LBR','LBY','LIE','LTU','LUX','MAC','MKD','MDG','MWI','MYS','MDV','MLI','MLT',
    'MHL','MTQ','MRT','MUS','MYT','MEX','FSM','MDA','MCO','MNG','MNE','MSR','MAR',
    'MOZ','MMR','NAM','NRU','NPL','NLD','NCL','NZL','NIC','NER','NGA','NIU','NFK',
    'MNP','NOR','OMN','PAK','PLW','PSE','PAN','PNG','PRY','PER','PHL','PCN','POL',
    'PRT','PRI','QAT','REU','ROU','RUS','RWA','BLM','SHN','KNA','LCA','MAF','SPM',
    'VCT','WSM','SMR','STP','SAU','SEN','SRB','SYC','SLE','SGP','SXM','SVK','SVN',
    'SLB','SOM','ZAF','SGS','SSD','ESP','LKA','SDN','SUR','SJM','SWE','CHE','SYR',
    'TWN','TJK','TZA','THA','TLS','TGO','TKL','TON','TTO','TUN','TUR','TKM','TCA',
    'TUV','UGA','UKR','ARE','GBR','UMI','USA','URY','UZB','VUT','VEN','VNM','VGB',
    'VIR','WLF','ESH','YEM','ZMB','ZWE')
    ORDER BY COUNTRY_CODE ASC;

    Once I execute that SQL statement, I’m left with the countries from that list. I opened up the results in another window so you can see a sample.

    The end

    So yeah, that’s it! I don’t know if this was the best way to go about this, but it was fun. I’m curious (if you’ve made it this far), what do you think? How would you go about it? Let me know.


  • More fun with Medium story stats, JSON, Python, Pandas, and Oracle SQL Developer Web

    More fun with Medium story stats, JSON, Python, Pandas, and Oracle SQL Developer Web

    That’s right; I’m back again for yet another installment of this ongoing series dedicated to working with Medium.com story stats. I first introduced this topic in a previous post. Maybe you saw it. If not, you can find it here.

    Recap

    My end goal was to gather all story stats from my Medium account and place them into my Autonomous Database. I wanted to practice my SQL and see if I could derive insights from the data. Unfortunately, gathering said data is complicated.

    Pulling the data down was a breeze once I figured out where to look for these story statistics. I had to decipher what I was looking at in the Medium REST API (I suppose that was somewhat tricky). My search was mostly an exercise in patience (there was a lot of trial and error).

    I uploaded a quick video in the previous post. But I’ll embed it here so you can see the process for how I found the specific JSON payload.

    Obtaining the raw JSON

    Once I found that URL, I saved this JSON as a .json file. The images below show remnants of a JavaScript function captured with the rest of the JSON. I’m no JavaScript expert, so I can’t tell what this function does. But before I load this into my Autonomous Database (I’m using an OCI Free Tier account, you can check it out here if you are curious), it needs to go.


    README

    I am pointing out a few things that may seem convoluted and unnecessary here. Please take the time to read this section so you can better understand my madness. 
    
    FIRST: Yes, you can manually remove the [presumably] JavaScript saved along with the primary JSON payload (see above paragraphs). I'm showing how to do this in Python as a practical exercise. But I'm also leaving open the opportunity for future automation (as it pertains to cleaning data). 
    
    SECOND: When it comes to the Pandas data frame steps, of course, you could do all this in Excel, Numbers, or Sheets! Again, the idea here is to show you how I can clean and process this in Python. Sometimes doing things like this in Excel, Numbers, and Sheets is impossible (thinking about enterprise security here). 
    
    THIRD: Admittedly, the date-time conversion is hilarious and convoluted. Of course, I could do this in a spreadsheet application. That's not the point. I was showing the function practically and setting myself up for potential future automation. 
    
    FOURTH: I'll be the first to admit that the JSON > TXT > JSON > CSV file conversion is comical. So if you have any suggestions, leave a comment here or on my GitHub repository (I'll link below), and I'll attribute you!
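On that fourth point: for what it’s worth, here’s a condensed sketch of the same pipeline with no temporary files (my simplification, not the post’s code; it assumes the 16-character prefix described below and Medium’s payload/value structure):

import json
import pandas as pd

# read the raw response, slice off the 16-character JavaScript prefix,
# and parse the remainder as JSON in one pass
with open("1668776608433.json", "r") as f:
    data = json.loads(f.read()[16:])["payload"]["value"]

# pandas builds the columns straight from the JSON keys
df = pd.DataFrame(data)
df["timestampMs"] = pd.to_datetime(df["timestampMs"], unit="ms")
df.to_csv("medium_stats_ready_for_database_update.csv", index=False)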

    The code

    Explaining the code in context, with embedded comments, will be most illuminating.

    I’ve named everything in the code as literally as possible. In production, this feels like it might be impractical; however, there is no question about what the hell the code is doing! Being more literal is ideal for debugging and code maintenance.

    Here is the entire code block (so CTRL+C/CTRL+V to your heart’s content 😘). I’ll still break this down into discrete sections and review them.

    import csv
    import json
    import pandas as pd 
    import datetime
    from pathlib import Path
    
# You'll first need to sign in to your account; then you can access this URL without issues:
# https://medium.com/@chrishoina/stats/total/1548525600000/1668776608433
# NOTES:
# Replace the "@chrishoina" with your username.
# The two numbers you see are Unix epochs; you can modify those as needed. In my case, I
# wanted to see the following:
# * 1548525600000 - At the time of this post, this seems to be whenever your first post
#   was published or when you first created a Medium account. In this case, for me, this
#   was Sat, Jan/26/2019, 6:00:00 PM GMT.
# * 1665670606216 - You shouldn't need to change this since it will just default to the
#   current date.

# For the conversion, I used an Epoch Converter tool I found online: https://www.epochconverter.com/

# Step 1 - Convert this to a (.txt) file
p = Path("/Users/choina/Documents/socialstats/1668776608433.json")
p.rename(p.with_suffix('.txt'))
    
    # Step 2 - "read" in that text file, and remove those pesky
    # characters/artifacts from position 0 through position 15. 
    # I'm only retaining the JSON payload from position 16 onward.
    
    with open("/Users/choina/Documents/socialstats/1668776608433.txt", "r") as f:
    stats_in_text_file_format = f.read()
    
    # This [16:] essentially means grabbing everything in this range. Since
    # there is nothing after the colon; it will just default to the end (which is
    # what I want in this case).
    
    cleansed_stats_from_txt_file = stats_in_text_file_format[16:]
    
    print(cleansed_stats_from_txt_file)
    
# This took me a day to figure out, but this text file needs to be encoded
# properly so I can save it as a JSON file (which is about to happen). I
# always need to remember this: json.dumps = dump string, whereas
# json.dump = dump object. There is a difference; I'm not the expert, but
# the docs were helpful.

json.dumps(cleansed_stats_from_txt_file)

# Step 3 - Here, I create a new file, then indicate we will "w"rite to it. I take the
# progress from Step 2 and apply it here.

with open('medium_stats_ready_for_pandas.json', 'w') as f:
    f.write(cleansed_stats_from_txt_file)
    
    # Step 4 - Onto Pandas! We've already imported the pandas library as "pd."
    # We first create a data frame and name the columns. I kept the names
    # very similar to avoid confusion. I feared that timestampMs might be a
    # reserved word in Oracle DB or too close, so I renamed it. 
    
    df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS']) 
    
    with open("/Users/choina/Documents/socialstats/medium_stats_ready_for_pandas.json", "r") as f: 
    data = json.load(f)
    data = data['payload']['value']
    
    print(data)
    
    for i in range(0, len(data)):
    df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'], data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]
    
    df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
    
    print(df.columns)
    
    # Step 5 - use the Pandas' df.to_csv function and save the data frame as
    # a CSV file
    
    with open("medium_stats_ready_for_database_update.csv", "w") as f:
    df.to_csv(f, index=False, header=True)

I used several Python libraries for this script:

    Step 1

p = Path("/Users/choina/Documents/socialstats/1668776608433.json")
p.rename(p.with_suffix('.txt'))

    Pathlib allows you to assign the file’s path to “p”. From there, I changed the .json file extension to a .txt extension.

    Note: Again, I'm sure there is a better way to do this, so if you're reading, leave a comment here or on my GitHub repository so I can attribute it to you 🙃. 

The before and after of this step looks like this:

    With that out of the way, I needed to remove that JavaScript “prefix” in the file. I do this in Step 2 (I got so fancy that I probably reached diminishing returns). My approach works, and I can repurpose this for other applications too!

    Step 2:

    # Step 2 - "read" in that text file, and remove those pesky
    # characters/artifacts from position 0 through position 15. Or in other
    # words, you'll retain everything from position 16 onward because that's
    # where the actual JSON payload is.
    
    with open("/Users/choina/Documents/socialstats/1668776608433.txt", "r") as f:
    stats_in_text_file_format = f.read()
    
    # This [16:] essentially means grabbing everything in this range. Since
    # there is nothing after the colon; it will just default to the end (which is
    # what I want in this case).
    cleansed_stats_from_txt_file = stats_in_text_file_format[16:]
    
    print(cleansed_stats_from_txt_file)
    
    # This took me a day to figure out, but this text file needs to be
    # appropriately encoded to save as a JSON file (which is about to
    # happen). I always forget the difference between "dump" and "dumps";
    # json.dumps = dump string, whereas json.dump = dump object. There is
    # a difference, I'm not the expert, but the docs were helpful (you should
    # read them). 
    
    json.dumps(cleansed_stats_from_txt_file)
    
I needed to remove these remnants from the Medium JSON response.

While this initially came through as a JSON payload, those first 16 characters (positions 0-15) had to go.

    FULL DISCLAIMER: I couldn't figure out how to get rid of this while it was still a JSON file hence why I converted this to a text file (this was the only way I could figure it out).

    I captured position 16 to infinity (or the end of the file, whichever occurs first), then I re-encoded the file as JSON (I interpreted this as “something the target machine can read and understand as JSON“).

    OPEN SEASON: CompSci folks, please roast me in the comments if I'm wrong. 

    Step 3

# Step 3 - I create a new file, then I'll "w"rite to it. I took the result from Step 2 and applied it here.

with open('medium_stats_ready_for_pandas.json', 'w') as f:
    f.write(cleansed_stats_from_txt_file)

    I’m still at the data-wrangling portion of this journey, but I’m getting close to the end. I’ll create a new JSON file, take the parts of the (freshly encoded) text file I need, and then save them as that new JSON file.

    Step 4

    # Step 4 - Onto Pandas! We've already imported the pandas library as "pd"
    
    # I first create a data frame and name the columns. I kept the names 
    # similar to avoid confusion. I feared that timestampMs might be a
    # reserved word in Oracle DB or too close, so I renamed it. 
    
    df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS']) 
    
    with open("/Users/choina/Documents/socialstats/medium_stats_ready_for_pandas.json", "r") as f: 
    data = json.load(f)
    data = data['payload']['value']
    
    print(data)
    
    for i in range(0, len(data)):
    df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'], 
    data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]
    
    df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
    
    print(df.columns)

    I won’t teach Pandas (and honestly, you do NOT want me to be the one to teach you Pandas), but I’ll do my best to explain my process. I first created the structure of my data frame (“df” in this case). And then, I named all the column headers (these can be anything, but I kept them very close to the ones found in the original JSON payload).

    I then opened the newly-saved JSON file and extracted what I needed.

    NOTE: I got stuck here for about a day and a half, so let me explain this part. 

The data['payload']['value'] expression drills into this particular nested {key: value} structure: it selects the “payload” key, then the “value” key inside it. This approach allowed me to grab everything stored under “value”. This image explains what I started with (on the left) and what I ended up with (on the right).

The before and after JSON payload.

    You’ll notice a {"success": true} key: value pair. With this method, I removed that pair and shed others at the end of the JSON payload.

Removing a great deal of trash.

I can’t take credit for organically coming up with this next part; Kidson on YouTube is my savior. I’d watch this video to fully understand what is happening in this piece of code:

    for i in range(0, len(data)):
        df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'],                                   
        data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]

    In short, you take the values from the columns in the JSON file (above) and then put them into the column locations named in this piece of code:

    df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS'])  

    For instance, the "userId" values in the JSON file will all go into the 'USERID' column in the Pandas data frame. And the same thing will happen for the other values and associated (Pandas data frame) columns.

Finally, I converted the date (which, if you recall, is still in Epoch format) to a more friendly, readable date with Pandas’ to_datetime function. Using this code:

    df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
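To see what that conversion actually does, here’s a one-liner you can try (the epoch value is the one from the comments earlier in this post):

import pandas as pd

# 1548525600000 ms since the Unix epoch -> a readable timestamp
print(pd.to_datetime(1548525600000, unit="ms"))  # 2019-01-26 18:00:00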

    Step 5

    with open("medium_stats_ready_for_database_update.csv", "w") as f:
        df.to_csv(f, index=False, header=True)

I’m in the home stretch now. I take everything I’ve done in Pandas and save it as a CSV file. I wanted to keep the headers but ditch any indexing. The clean CSV file will look like this:

Cleaned, tidy CSV ready for Data Load via SQL Developer Web.

    Step 6

    Lastly, I logged into SQL Developer Web and clicked the new Data Load button (introduced in Oracle REST Data Services version 22.3) to upload the CSV file into a new table. The Autonomous Database automatically infers column names and data types. I slightly modified the "statsdate" column (honestly, I could have left it alone, but it was easy enough to change).

    Before and After

    And that’s it! Once uploaded, I can compare what I did previously to what I have achieved most recently. And both ways are correct. For instance, depending on your requirements, you can retain the JSON payload as a CLOB (as seen in the first image) or a more traditional table format (as seen in the second image).

    Wrap up

    If you’ve made it this far, congrats! You should now have two ways to store Medium stats data in a table (that lives in the Oracle Autonomous Database) either as:

    • a CLOB
    • an OG table

    And if you’d like to review the code, you can find it here.

    And until next time 🫠, happy querying.

  • Fun with Python GET requests, Medium stats, and the Oracle Autonomous Database

    Fun with Python GET requests, Medium stats, and the Oracle Autonomous Database

I feel so silly for posting this because you’ll quickly realize that I will have to leave things unfinished for now. But I was so excited that I got something to work that I had to share!

    If you’ve been following along, you know you can always find me here. But I do try my best to cross-post on other channels as well:

    But given that everything I do supports the development community, audience statistics are always crucial to me. Because of this, I’ll periodically review my stats on this site and the others to get a feel for the most popular topics.

    Python continues to be huge. Really anything Python and REST API related. My Python and Filtering in Queries and Python POST Requests posts seemed to generate A LOT of interest.

I even did a RegEx post a while back that was fairly popular. Thankfully it wasn’t too popular, as it pained me to work through Regular Expressions.

    I can quickly review site statistics on this blog, but other places, like Medium, are more challenging to decipher. Of course, you can download your Audience stats, but sadly not your Story stats 😐.

Audience stats download, but no Story stats download.

Undeterred, I wanted to see if it was somehow possible to acquire my Story stats. And it is possible, in a way…

    Show and tell

After you log into your Medium account, navigate to your stats page, open up the developer tools in your browser, and navigate to the “Console.” From there, reload the page and simply observe all the traffic.

    You’ll see a bunch of requests:

    • GET
    • POST
• OPTIONS (honestly, I’ve no idea what this is, but I also haven’t looked into it yet)

My thought was that the stats content was produced through (or by) one of these API requests. So yes, I expanded every request (one at a time) and reviewed each Response Body. I did that until I found something useful. And after a few minutes, there it was:

The magic GET request.

    I confirmed I had struck gold by taking this URL, placing it in a new browser window, and hitting Enter. And after selecting “Raw Data,” I saw this:

Double-checking the raw JSON.

Indeed, we see my Story stats. But the final two path segments in the URL made no sense to me.


    The paths looked similar; I had no choice but to activate Turing Mode™.


I could see these numbers were similar, so I lined them up in my text editor and saw that they shared the same “166” prefix. I don’t know much about machine-readable code, but since what was appearing on my screen was the last 30 days, I thought this might be some sort of date. But I’d never seen anything like this, so I wasn’t 100% sure.

    Unix Time Stamps

    After about 20 mins of searching and almost giving up, I found something in our Oracle docs (a MySQL reference guide of all places) that referenced Unix Time Stamps. Eureka!

About Unix time stamps in the Oracle MySQL docs.

    Success, I’d found it. So I searched for a “Unix time stamp calculator” and plugged in the numbers. My hunch was correct; it was indeed the last thirty days!

Verifying the Unix Time Stamp.

So now I’m wondering: if I change that leading date in the GET request, will it allow me to grab all my story statistics from January 2022 till now? Oh, hell yeah, it will!

All my Story stats from Jan 2022 to the present.
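And if you’d like to script that date range instead of hand-editing the URL, something like this should work (the username is a placeholder, and the URL pattern is simply the one observed above):

from datetime import datetime, timezone

def ms_epoch(dt):
    """Convert a datetime to a millisecond Unix epoch."""
    return int(dt.timestamp() * 1000)

start = ms_epoch(datetime(2022, 1, 1, tzinfo=timezone.utc))
end = ms_epoch(datetime.now(timezone.utc))

url = f"https://medium.com/@your_username/stats/total/{start}/{end}"
print(url)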

    End of the line

    Right, so here is where I have to leave it open-ended. I had a finite amount of time to work on this today, but what I’d like to do is see if I can authenticate with Basic Authentication into my Medium account. And at least get a 200 Response Code. Oh wait, I already did that!?

Getting that sweet, sweet 200 Response Code.

    And now the Python code!

import requests
from requests.auth import HTTPBasicAuth

url = "https://medium.com/m/signin"

# I found this to work even though I typically sign on through
# Google single sign-on. I just used the same email/password
# I do when I log in directly to Google (Gmail).
user = "[Your login/email]"
password = "[Your password]"

r = requests.get(url, auth=HTTPBasicAuth(user, password))

print(r)

# I found this URL in the console, then removed everything after
# the query string (the "?") and used what remained as the request's URL:
# "/m/signin?operation=login&redirect=https%3A%2F%2Fmedium.com%2F&source=--------------------------lo_home_nav-----------"

    You’re probably wondering how I found the correct URL for the Medium login page. Easy, I trolled the Console until I found the correct URL. This one was a little tricky, but I got it to work after some adjusting. I initially found this:

    "/m/signin?operation=login&redirect=https%3A%2F%2Fmedium.com%2F&source=--------------------------lo_home_nav-----------"

    And since I thought everything after that “?” was an optional query string, I just removed it and added the relevant parts to Medium’s base URL to get this:

    https://medium.com/m/signin

    Next steps

    From here, I’d like to take that JSON object and either:

    • use the Python Pandas library to clean up before loading into my Autonomous Database via ORDS, or
    • extract the existing JSON (CLOB) with SQL (as was detailed in this Oracle Community forum post)

    If I want to keep it as is, I know I can load the JSON with a cURL command and an ORDS Batch Load API with ease. I dropped this into my Autonomous Database (Data Load) to see what it would look like:

    CLOB-in-my-Autonomous-Database-Chris-Hoina-Senior-Product-Manager-ORDS-Oracle-REST-APIs-Database-Actions
    My CLOB.

    We do something very similar in the Oracle LiveLabs workshop (I just wrote about it here). You can access the workshop here!

    I’ll have a follow-up to this. But for now, this is the direction I am headed. If you are reading this, and want to see more content like this, let me know! Leave a comment, retweet, like, whatever. So that I know I’m not developing carpal tunnel for no reason 🤣.

  • Oracle REST APIs: A LiveLabs review

    Oracle REST APIs: A LiveLabs review

    Recently Jeff and I were invited by the Oracle Developers and Developer Relations teams to do a walkthrough of a LiveLabs workshop, “How to Build Powerful and Secure REST APIs for Your Oracle Autonomous Database.”

    We spent about 90 minutes moving through selected labs in the workshop. Luckily they recorded it for us; you can watch it in all its glory here.

    If that video piques your interest, I encourage you to complete the workshop since it provides an excellent overview of Oracle REST Data Services APIs — specifically when working in Database Actions (in the Oracle Autonomous Database).


    About the workshop

Labs 1, 2, and 7 are common across many workshops. These were our focus.

    The workshop consists of seven labs, but labs 3-6 were the main focus.

Two approaches to REST-enabling your Oracle database objects.

    We also wanted to highlight the two ways a user could create Oracle REST APIs in Database Actions (formerly SQL Developer Web). You can jump right in with auto-REST enabling or get creative by building your Resource Modules > Templates > Handlers.


    Workshop highlights

    I won’t walk through the labs in detail here, but what I will do is highlight areas that:

    • Were cool/worth revisiting, or
    • Have (or continue to) helped speed up my productivity in Database Actions (and through association with the Autonomous Database)
    The videos are queued up to the related topic.

    Lab 3

    Lab 3 walks you through connecting to an Autonomous Database with Database Actions. From there, you create a table from a CSV file. And finally, you’ll auto-REST enable the table with simple mouse clicks.

    Data Loading

I’ve found no fewer than three GUI-based ways to load data in Database Actions.

    Auto-REST enabling

    We are using mouse clicks for auto-REST enabling database objects in the Oracle Autonomous Database.

    Show Code toggle

    The new “Show Code” toggle switch in Database Actions.

    This feature isn’t limited to the SQL Worksheet; it's found across Database Actions!

    cURL command options for your environment

    cURL commands now provide Power Shell, Command Prompt, and Bash examples.

    Lab 4

    Lab 4 walks you through using a Batch Load API for loading two million+ rows into the table you previously created (in Lab 3). We also make a SQL procedure and later use PL/SQL to simulate a REST API call to the table.
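To give you a taste, here’s a hedged sketch of what a Batch Load API call can look like from Python (the endpoint path and batchRows value are illustrative; the lab itself uses cURL):

import requests

# Auto-REST enabled tables expose a /batchload endpoint for bulk CSV loads.
url = "https://<your-adb>/ords/<schema>/csv_data/batchload"

with open("rows.csv", "rb") as f:
    r = requests.post(
        url,
        params={"batchRows": 5000},  # commit every 5,000 rows
        data=f,
        headers={"Content-Type": "text/csv"},
    )

print(r.status_code, r.text)  # ORDS responds with a load summary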

We briefly discussed the Cloud Shell and Code Editor (both in Oracle Cloud Infrastructure). Click the links to learn more; they are free and included in your OCI tenancy 😀.

    A crash course on query parameters

    Jeff has a helpful article here (one I reference A LOT).
    You can review our docs here (we mention it in several areas).

    Graduating from auto-REST

    A short discussion on when and why you may want to move away from auto-REST-enabled Oracle APIs to more customized Oracle REST APIs.

    Lab 5

    In Lab 5, you use Database Actions and the REST console to build a REST API using a parameterized PL/SQL procedure and SQL statement. We do this manually in the previous lab but then REST-enable it here (this is a continuation and refinement of the last lab).

    Bind variables in a POST request

    ORDS docs on Bind variables.
    This continues to confound me, so if you are in the same boat as me and you want me to do some more dedicated posts on this, let me know!

    Lab 6

    The goal of this lab was to educate you on Roles, Privileges, and OAuth 2.0 Client Authentication. Unfortunately, we ran out of time and had to speed through this final section. However, I did show off some of the OpenAPI functions within Database Actions.

    OpenAPI Specifications

Specifically, we reviewed how you can view your Resource Modules in the OpenAPI view (displayed as a Swagger UI implementation) and how you can view/execute handlers to observe their responses.

    We also mentioned how you can export a Resource Module in either PL/SQL code or the OpenAPI JSON code.


    I suspect you should be all set to complete this workshop (located here). But why stop the fun there? We have some other LiveLabs workshops that might interest you, too. You should check them out!

    Our other LiveLabs workshops

    The last workshop on the list is our newest one! So if you do attempt it, feel free to create an issue for enhancements (or if anything is unclear and needs updating) on my GitHub repository 😊!


  • Python POST requests three ways with Oracle REST Data Services (ORDS)

    Python POST requests three ways with Oracle REST Data Services (ORDS)

    The simple POST request

    It was bugging me that I couldn’t perform a simple Python POST request to an ORDS REST-enabled table.

I wanted to send HTTPS POST requests to this table in my Oracle Autonomous Database.
This one actually…

    I don’t mean to convey that this isn’t possible. Up until very recently, I wasn’t able to do this. Luckily I had a few hours free, so I took to the docs to do some reading. And wouldn’t you know it, like most things in tech, reading the documentation was a practical and valuable use of my time.

    Side note

Let me pause here for a minute or two. I should add this disclaimer: none of what I share here uses OAuth 2.0 authentication. Of course, I wish it did, but I’m just not there yet (technical proficiency). In the future, I’d like to update this with security in mind.1

1 We have an API security section in our ORDS Best Practices guide.

The other thing I’ll mention is that I’m going to include the SQL for creating this table, along with all the Python code, in my GitHub repo (I’ll also add any updates I make to this code to the repo!).

    Also, did you know that saving DDL with Database Actions is just a mouse click away?

    New User speed run

    This section is a bit of an aside, but I also created a new Python Developer user in one of my Autonomous Databases. It’s straightforward to do as the admin. Here is a “speed run”:

    POST haste

    After creating my new user, I created a “Python_Post” table. Super imaginative, right? And I kept things simple, naming the four columns (wait for it): "col1", "col2", "col3", and "col4".

Damn…he actually did that.

    Cheat codes

    I auto-REST enabled my table and reviewed a Bash cURL command so I could remind myself of the expected data (aka payload).

A Bash cURL command for POST requests; it can be used for later reference in your editor.

I’ve noticed that if I don’t specify a ‘rowid’, the Autonomous Database automatically creates one. SQL newbies (like me) might appreciate this since we still don’t know all the best practices for database design (or DDL, for that matter)!

    My process might differ from yours, but I’ve used the cURL commands in Database Actions as a cheat. I’ve been copying/pasting this cURL command into my working Python file, so I don’t have to switch screens. Additionally, it helps to remind me what the {‘key’:‘value’} pairs are (even though I just created all this, I STILL can’t remember what I did).

    In this case, I’m referencing a POST request, but you could do this for the other HTTPS methods too:

    • GET ALL
    • GET
    • POST
    • BATCH LOAD
    • PUT
    • DELETE

    Moving on…

    I could omit the ‘rowid’ when making these POST requests. I don’t know if this is typical when working with databases, but this seems to work consistently (at least with the testing I did) with Python and the Requests library.

    If you weren’t taken aback by my imaginative table name and column names, then get ready because I’m about to blow your mind with this next bit. I created a payload for each of these POST request variations:

And I’m just going to be as literal as possible to avoid confusion…is it lame? Yes. Does it work? Undecided, you tell me.

    In case you didn’t catch it, they were: payload1, payload2, and payload3.

    On feedback…


    I also included some feedback for myself. And I feel like this is a good practice because if you successfully POST something, how would you know? Conversely, the same would be true if you unsuccessfully POSTed something. If I were smart, I’d design a REST API in Database Actions that automatically includes some feedback. But I am not.

    If you want to read about this Implicit Parameter and others, click here.

    Luckily, the Python Requests library includes various feedback methods. I’ve included the following in my Python file:

    • status_code – returns the response status code
• raise_for_status() – raises an error for a failed request (if applicable); returns None if the request was successful
    • headers – returns the server’s response headers as a Python dictionary

    Different payloads

My Python code is simple enough. It looks like more than it is because I include three POST request variations. I’ve also tested payloads, and it seems like we can send a “payload” as one of the following (see the sketch after this list):

    • params – query parameters (e.g. https://httpbin.org/get?key2=value2&key1=value1)
    • json – JSON data
    • data – a Python dictionary (in this use case)
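To make those three variations concrete, here’s a hedged sketch of what each one actually puts on the wire (httpbin.org is a stand-in echo service, not my ORDS endpoint):

import requests

payload = {'col1': 'test', 'col2': 'test'}

# params: appended to the URL as a query string (?col1=test&col2=test)
r = requests.post("https://httpbin.org/post", params=payload)

# json: serialized into a JSON request body (Content-Type: application/json)
r = requests.post("https://httpbin.org/post", json=payload)

# data: sent as a form-encoded body (application/x-www-form-urlencoded)
r = requests.post("https://httpbin.org/post", data=payload)

print(r.json()["form"])  # httpbin echoes back what it received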

    Oracle REST Data Services has a ton of documentation on filtering with query parameters… I’m still learning about it, but they are powerful. Check them out here.
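For instance, the docs describe a q query parameter that accepts a JSON filter object; here’s a minimal sketch of using it from Python (the URL is a placeholder):

import json
import requests

# ask ORDS for only the rows where COL1 equals 'test1'
filter_object = {"col1": {"$eq": "test1"}}

r = requests.get(
    "https://<your-adb>/ords/<schema>/python_post/",
    params={"q": json.dumps(filter_object)},
)
print(r.json()["items"])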

    You can even send payloads consisting of nested items/objects, too (e.g., an array or list as one of your values). I’m sure this violates normalization rules, but it’s possible. In my case, I didn’t do this; I just stuck to my four columns.

    Executing the code

    After I executed the code, I received the following feedback for these POST requests:

If you squint, you can see three separate responses. I’ve only boxed one, but you should be able to point out the other two.

    You should see:

• a 201 status code, which indicates “the request has been fulfilled and has resulted in one or more new resources being created.”2
• None – if there had been a 400-level error, this would instead show the error message
    • {key: value} pairs displaying all header details

2 About the 201 Status Code in Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content

    Double checking

    While in VS Code, the POST requests appeared successful. But since I had access, I logged into Database Actions and manually inspected my table. Success!

Reviewing the new updates in Database Actions.

    And that’s it…zero to POST request in my Oracle Autonomous Database + ORDS in no time!

    Was this helpful?

I have this code in my GitHub repository, so feel free to fork, download, or add comments. But I’ll include it here too:

    import requests 
    
    url = "[Your ORDS URL goes here]"
    
    payload1 = {'col1': 'test1', 'col2':'test1', 'col3':'test1', 'col4': 'test1'}
    
    r1 = requests.post(url, json=payload1)
    
    print(r1.status_code)
    print(r1.raise_for_status())
    print(r1.headers) 
    
    payload2 = {'col1': 'test2', 'col2':'test2', 'col3':'test2', 'col4': 'test2'}
    
    r2 = requests.post(url, data=payload2)
    
    print(r2.status_code)
    print(r2.raise_for_status())
    print(r2.headers) 
    
    payload3 = {'col1': 'test3', 'col2':'test3', 'col3':'test3', 'col4': 'test3'}
    
    r3 = requests.post(url, params=payload3)
    
    print(r3.status_code)
    print(r3.raise_for_status())
    print(r3.headers) 

    And if you are still here — was this helpful? Want to see more or something else that is Python + ORDS-related? Let me know here in the comments, on Twitter, or by email!

  • Python + Folium + Oracle REST APIs (aka ORDS)

    Python + Folium + Oracle REST APIs (aka ORDS)

    Willkommen

    I stumbled upon a new [to me] python library called Folium. It’s a mapping tool that enables python developers (or is it programmers, which is less offensive?) to visualize data on a Leaflet map.

    About folium

    Folium makes it easy to visualize data that’s been manipulated in Python on an interactive leaflet map. It enables both the binding of data to a map for choropleth visualizations as well as passing rich vector/raster/HTML visualizations as markers on the map.

    The library has a number of built-in tilesets from OpenStreetMap, Mapbox, and Stamen, and supports custom tilesets with Mapbox or Cloudmade API keys. Folium supports Image/Video, GeoJSON and TopoJSON overlays.

    Folium docs

    …but what about leaflet.js?

    Leaflet.js is the leading open-source JavaScript library for mobile-friendly interactive maps. Out-of-the-box features include: tile layers/WMS, markers/popups, vector layers (polylines, polygons, circles, rectangles), image overlays, and GeoJSON

    Leaflet.js “Features” page

    Tilesets

    I highlighted the tilesets above, and I’m not sure if this is a GIS or a leaflet term, but it seems that tilesets refer to different map renderings. Folium includes the following:

    • OpenStreetMap
    • Stamen Terrain
    • Stamen Toner
    • Stamen Watercolor

    At the time of writing, I didn’t have API credentials for Mapbox Bright or Mapbox Control Room. But I’m assuming they still work if you have the appropriate API credentials.

    Excluding the Mapbox tiles, those packaged by default are more than sufficient:

    Tip: Read more on Stamen here, s'il vous plaît. They seem to be the original creators of these Stamen tiles. 
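Switching tilesets is a single argument on the map constructor; for example (the coordinates here are arbitrary):

import folium

# same map, different look: swap the tiles argument
m = folium.Map(location=[35.787, -78.781], zoom_start=12, tiles="Stamen Toner")
m.save("toner_map.html")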

    Python + Folium + ORDS = Happiness

Folium’s installation and quickstart pages are straightforward, requiring minimal steps and effort. The same goes for installing the Live Server extension (a VS Code extension), too. Once installed, you can right-click on a .html file to load it onto a local development server (so it displays in a new browser window).

    Let me jump right into how I set up Folium to work with ORDS. If you recall, most of the tables I have in my Autonomous Databases are REST-enabled (previous post working with ORDS).

    An ORDS Review

1. Once in Database Actions, right-click on an object and select REST > Enable.
2. Choose an alias (good security measure) along with Authentication (also good practice).
3. If you forget the URI, you can review the cURL command.
4. From there, you can copy/paste directly into a command line, or copy the URI portion and place it into your browser’s address bar.

    …and now back to your regularly scheduled blog post

Since I needed latitude and longitude coordinates for Folium, I decided to work with my Restaurant table. Put the REST endpoint (URI) directly into the browser, and you’ll see this:

The ORDS REST endpoint (URI) response in the browser.

    Afterward, I looked at the JSON object to see how it was structured (there are too many lists, dictionaries, arrays, etc. to remember amirite?!) because I’d need to iterate through all the items in the GET Response – to tease out what Folium needs.

    The play-by-play

    Printing the response (to see what it looks like)

    • I’m covering all my bases here; reviewing the ORDS-provided request/object from a table in my Autonomous Database
• Using .json() to “decode” it into a Python dictionary
    • Please pay attention to line 10, where I actually create the base map

    Review of what Folium needs

    • Looping through the python dictionary, pulling out what I need for the next step
    • Including lines 25-29 to create the pop-up markers.
    • The ('<i>{}</i>'.format(Resty)) allows me to pass the names to the string (stuff in quotes, plus the {}). The HTML for italics is optional.
    • In this case, ‘x’ and ‘y’ are the coordinates
    • Line 31 saves a .html file (which uses Bootstrap!)

    Newly rendered .HTML

    • Using the Live Server extension, I can review the Folium map plus the pop-up markers!
    NOTE: This isn't the focus, but spoiler alert, I'll be putting this into a Flask application. I'll also build this out, so the individual markers/popups have charts/graphs for each restaurant. Again, all of which will be provided by ORDS-enabled tables (not so shameless plug).

    Summary

After all is said and done, it took me less than an afternoon to figure out how to make this work. So if you have a table with latitude and longitude, what are you waiting for?! REST-enable it with ORDS, and you can come up with a very quick demo on your own!

    Also, I didn’t have to create any database models, the dependencies are minimal (a few python libraries), and an Always Free OCI account is really all you need to get started.

    Code

import requests
import folium

m = folium.Map(location=[35.78742648626059, -78.78122033558192], zoom_start=12, tiles="Stamen Watercolor")

response = requests.get('your ORDS endpoint/URI').json()

# print(type(response))
# print(response)

for entry in response['items']:
    Resty = entry['name']
    Lat = entry['y']
    Long = entry['x']

    folium.Marker(
        location=[Lat, Long],
        popup=folium.Popup('<i>{}</i>'.format(Resty))
        # this line is optional/an alternative to the above line
        # popup=folium.Popup('<i>{}</i>'.format(entry['name']))
    ).add_to(m)

# save once, after the loop has added every marker
m.save("index.html")

    [I did not] reinvent the wheel

    Lastly, much of what I learned about Folium came from the following two videos (I recommend bookmarking them for reference):

    Follow me…

    That’s it for now folks!

• Working with WTForms and Oracle REST Data Services (ORDS) APIs

Working with WTForms and Oracle REST Data Services (ORDS) APIs

    Welcome Back

    The title pretty much speaks for itself, and if you’ve been following along, you’d know that I’m working on a demo application that uses Flask (a web application microframework) connected to my Oracle Autonomous Database (ADB) via Oracle REST Data Services (ORDS) APIs.

    One of Flask’s strengths is that it allows you to extend functionality and allows for a good bit of customization. But…

    By default, Flask does not include a database abstraction layer, form validation or anything else where different libraries already exist that can handle that. Instead, Flask supports extensions to add such functionality to your application as if it was implemented in Flask itself.

    Flask Docs

    Why ADB and ORDS?

    I’m primarily using this specific configuration (with ORDS and ADB) for obvious reasons. But secondarily, I haven’t come across anything that actually references the Oracle ADB or ORDS (in the context of Flask, WTForms, and python of course).

Most of the online tutorials I’ve seen and books I’ve read reference SQLite or (in one case) MongoDB for the database. SQLite does seem like a good entry point, especially if you are putting together a proof of concept. But as things scale, I suspect you’ll reach diminishing performance returns with SQLite. In fact, the Flask documentation mentions exactly that:

    Python comes with built-in support for SQLite in the sqlite3 module. SQLite is convenient because it doesn’t require setting up a separate database server and is built-in to Python. However, if concurrent requests try to write to the database at the same time, they will slow down as each write happens sequentially. Small applications won’t notice this. Once you become big, you may want to switch to a different database.

    Flask Docs

    That last sentence is significant. I know ORDS can be set up with High Availability in mind, so why not take advantage of that from the beginning? Also, with my ADB, the effort to create a database schema and tables is very minimal (fun even).

Regarding MongoDB: obviously, I’m biased, but I’m also disorganized. And I’m still a “technical toddler.” So having to download, install, and configure MongoDB Community Edition plus the MongoDB Compass GUI would end up being an unnecessary step for me (I actually ended up doing this for a previous tutorial).

    Regardless, I already have my OCI tenancy and my Autonomous Database, and I continue to get more and more comfortable with our ecosystem.

    Sign up for an Always Free OCI account, Oracle Cloud Infrastructure, Chris Hoina, Senior Product Manager, Database Tools, ORDS
    Also: You should sign up for an Always Free OCI account.

    Moving on… I also want to use Database Actions and the REST API features provided by ORDS. Not to mention, from everything I’ve seen in the Flask tutorials, you have to first create database models before you can do anything really meaningful. I’m not entirely sure how to go about that (please check back in 2023).

    ORDS is straightforward – just give me the URI for the API endpoint and I’ll take it from there. If I need something additional we can modify the API accordingly.

    API GET Request response (via ORDS)

    Speaking of which, here is what the JSON looks like from my ORDS-powered API GET request:

    ORDS REST API response from Autonomous Database, Chris Hoina, Senior Product Manager, Database Tools, ORDS

    I won’t spend too much time explaining this image (you can read about it in detail here), but I’m most interested in “city” (found under “items”). There are 18 cities, and I’ll need all of them for Flask. More specifically, I need them for WTForms.
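
    For orientation, here is a rough sketch of that response’s shape in python terms (heavily truncated, with illustrative values; the real response has 18 cities and more keys per row):

    # A rough, truncated sketch of the ORDS response shape (values illustrative).
    response = {
        "items": [
            {"city": "Cary"},  # one object per row; other columns omitted here
        ],
        "hasMore": False,  # whether more pages of results follow
        "limit": 25,       # the default page size
        "offset": 0,
        "count": 1,
        "links": [],       # pagination/metadata links, trimmed here
    }
    
    # All I really need is every "city" under "items":
    cities = [row["city"] for row in response["items"]]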

    About WTForms

    I won’t give a crash course on WTForms, but the bottom line is that it’s another way to extend the capabilities of Flask.

    WTForms is a flexible forms validation and rendering library for Python web development. It can work with whatever web framework and template engine you choose. It supports data validation, CSRF protection, internationalization (I18N), and more.

    WTForms Docs

    In my scenario, WTForms allows me to easily make and handle (as in routing) forms in the Flask application. More specifically, there is a SelectField() form field that I need to use (this is just how I initially designed the app; you can review it here).

    Choices

    The forms have fields. Here are details on the form I’m most interested in:

    WTForms Flask Python ORDS REST APIs Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools

    You’ll notice there are various parameters I can include, but all I care about are the choices[]. You can either hard-code (static) a list of choices or (as I just successfully did this week) pass a list from an API GET request to the choices[] parameter.

    If you’ve looked at the documentation, then you’ve seen that choices[] can be “passed” in one of three ways:

    1. a list of (value, label) pairs (it can also be a list of only values, in which case the value is used as the label)
    2. a dictionary of {label: list} pairs defining groupings of options, or
    3. a function taking no argument, and returning either a list or a dictionary

    My approach used the first option; more precisely, the values-only variation, where the value doubles as the label.
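
    To make those three options concrete, here is a minimal sketch (the field names and city values are made up, and the dictionary/callable forms assume a recent WTForms version):

    from wtforms import Form, SelectField
    
    class ExampleForm(Form):
        # 1. a list of (value, label) pairs -- or values only, where the
        #    value doubles as the label
        city_pairs = SelectField(choices=[('RAL', 'Raleigh'), ('CARY', 'Cary')])
        city_values = SelectField(choices=['Raleigh', 'Cary'])
    
        # 2. a dictionary of {label: list} pairs, defining groupings of options
        city_grouped = SelectField(choices={'Wake County': ['Raleigh', 'Cary']})
    
        # 3. a function taking no arguments, returning a list (or a dictionary)
        city_callable = SelectField(choices=lambda: ['Raleigh', 'Cary'])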

    The breakdown

    First, I’ll show the dropdown list; it will look like this in my application (FYI this actually works):

    Flask application in action

    I know that video is fast, but you should be able to see the dropdown list (at around the :15-:17 mark) and how it renders.

    This dropdown list is made possible from an initial GET request to my ORDS-enabled “Cities” table. And the code in its entirety (including the Flask bits), looks like this:

    Flask WTForms Python ORDS REST APIs Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools
    NOTE: I'm working off the assumption that you have some familiarity with Flask and WTForms, and that your focus is the REST-API-to-choices part of this problem. 

    By the numbers

    Okay, a quick overview:

    • Lines 27-29: This is pretty standard for Flask, no explanation is needed
    • Lines 31-33: Are where I build out the form:
      • You’ll see that city is the field (to select from)
      • The dropdown will have a heading that says, “Let’s begin by choosing a city” (you can see this in the video)
      • And the choices will be a list []
      • Submit on line 33 is just how you’d create the submit button
    • Lines 37-39: Are also pretty standard for any GET request using the Requests library in python
    NOTE: But here, you'll see that my URL is actually the URI provided by ORDS (remember, I REST-enabled my cities table - that is the "endpoint" we are seeing) 
    • Line 40: I make a list (it’s empty at this step, so sad)
    • Lines 41-43: This is the pièce de résistance.
      • Here we loop through “items” in the JSON response; since this is a nested array, you have to specify which array you want and what exactly you want to pull out of it (in this case, the “city”)
      • Line 43 was a suggestion by thatjeffsmith, to display the list in alphabetical order (I must confess, it does look better)
    • Lines 45-48: These are pretty standard for Flask as well. You need to set form.city.choices (the city field from line 32) equal to city_list so WTForms knows where the SelectField(choices=[]) values come from

    The Code

    With the brief explanation out of the way, here is the code. Please, help yourself:

    #For the GET request portion, you'll at a minimum need requests and json. If you are working in Flask already, then you know the rest. 
    import requests
    import json 
    from wtforms import SelectField, SubmitField
    from flask import Flask, render_template
    from flask_wtf import FlaskForm
    
    #Creating the Flask instance.
    app = Flask(__name__)
    app.config['SECRET_KEY'] = 'password1234'
    
    #Creating the Form.
    class CitiesForm(FlaskForm):
        city = SelectField("Let's begin by choosing a city", choices=[])
        submit = SubmitField('Submit')
    
    #Routes, used for displaying the html on your web app.
    @app.route('/', methods = ['GET', 'POST'])
    def index():
    
        #This is the section where you perform the GET request to retrieve the JSON from your ORDS endpoint.
        URL = '[your ORDS endpoint]'
        response = requests.get(URL)
        r = json.loads(response.text)
    
        #Creating the list of cities, which will be populated shortly.
        city_list = []
    
        #Looping through the nested array (from the initial JSON / API response). 
        for nestedArr in r['items']:
            city_list.append(nestedArr['city'])
    
        #Sorting once, after the loop, so the dropdown renders alphabetically.
        city_list.sort()
    
        #Finishing touches so WTForms and Flask know where to look. And also setting up how/where the html on your web application renders.
        form = CitiesForm()
        form.city.choices = city_list
        return render_template('index.html', form=form)

    Obviously, you’ll want to change up some of the names. But the ORDS pieces should remain pretty similar.

    Next up

    My next steps are to create second and third dependent dropdowns for the app. What I mean is that, based on a user’s city selection, that city would then be used as a query parameter for a subsequent GET request (to my ADB). And ORDS would respond with a list that includes restaurant information for the target city only (everything else would be excluded).
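
    I haven’t built that yet, but here is a rough sketch of the follow-up request I have in mind (the endpoint is a placeholder, I’m assuming ORDS’ “q” query parameter, which accepts a JSON filter object, and the “name” column is hypothetical):

    import json
    import requests
    
    # The city would come from the user's dropdown selection in the form.
    city = 'Cary'
    
    URL = '[your ORDS endpoint]'
    
    # ORDS accepts a JSON filter object via the "q" query parameter; this one
    # asks for only the rows where CITY equals the chosen city.
    params = {'q': json.dumps({'city': city})}
    response = requests.get(URL, params=params)
    
    for item in response.json()['items']:
        print(item['name'])  # hypothetical column with restaurant info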

    If you’ve made it this far and you have any questions, please send me a note.

    And if you think this is worth sharing, and that it might help others, please pass it around!

    Cash me outside

    As always, you can find me on:

  • Clean up a .CSV file with Regular Expressions, Pandas, and Python

    Clean up a .CSV file with Regular Expressions, Pandas, and Python

    Let us begin

    I log into Database Actions as my newly created “Python Developer” and navigate directly to the “Data Load” page (found under the Data Tools section of the Launchpad). I choose to “Load Data” from a “Local File.” I click “next,” click the pencil icon (see the arrow in the image), and navigate to the “File” tab. I scroll to the “RESTAURANTOPENDATE” column and see this:

    CSV file in data load, Database Actions, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Actions

    In a previous post (if you’re new, you can read the end-to-end process), I discussed how this time (“04:00:00+00” or “05:00:00+00”) wasn’t necessary for me. At that time, I used the “Find and Replace” function in Excel (I’m sure you can do the same with Numbers, Sheets, or Calc) to replace all occurrences of time with “” (i.e., nothing).

    But in the spirit of doing things in five days, when they could have taken but five minutes, I opted to see if I could achieve a similar result in python.

    Goal

    Create a python script that will allow you to remove part of a value in a .CSV file.

    WARNING: I thought this would be a simple task. I should have known better. My approach may not be ideal for your situation, but hopefully you’ll learn something. Or at the very least maybe you can bookmark this post, along with the resources (at the end of the post) I'm including for later use.

    Regular Expressions

    I am confident there is a way to achieve this goal with the csv library in python; there is probably even a way to do this with python out of the box. I couldn’t figure it out.

    I’m also reasonably confident that my approach is on the verge of ridiculous. Nevertheless, Regular Expressions, and the Pandas library, in python are what worked for me.

    What are Regular Expressions?

    Good question. I still don’t know, but here is what I found on Wikipedia:

    “A regular expression (shortened as regex or regexp; also referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually, such patterns are used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory.”

    That is painful to read, but the way I understand it is that we use Regular Expressions for pattern-matching. Essentially, you create a pattern and then tell your application (or script) to search for it. From there, you can include more code to perform further actions. In my case, those actions are to find a pattern and replace it with nothing.

    So what is the pattern?

    One of the benefits of having zero formal training in application development (in this case, computer science and formal language theory) is that occasionally, you might take an approach that, while unintuitive, works well enough.

    And after many, many hours of trial and error, parsing through Stack Overflow, reviewing hours of YouTube, reading pages of blogs, and occasional use of the “I’m Feeling Lucky” button on Google, it occurred to me that my pattern was:

    Pattern for Regular Expressions, Database Actions, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Actions

    After reviewing more documentation and other various resources (I have an entire section at the end), I more clearly identified a pattern:

    A more formal pattern for Regular Expressions, Database Actions, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Actions

    Coming up with a clearly-defined pattern helped inform me as to how I could create the following Regular Expression:

    '\s\d+:\d+:\d+[^a-c6]\d+$'

    I then did some more stuff, and that was it! Follow me for more…

    I’m kidding.

    Deep Dive

    I can’t teach you everything, because I’m probably only ahead of you by a week at this point. But I can explain, in the hopes that it will make sense. Let me continue with the above Regular Expression (again, you’ll want to spend some time in the resources section I’ve included to understand better how this all fits together).

    But the above Regular Expression can be defined like this:

    Regular Expression definition for this script, Database Actions, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Actions
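
    To sanity-check the pattern before wiring it into Pandas, here is a quick test with re.sub (the sample date is made up, but it mirrors the dataset’s date-plus-time format):

    import re
    
    pattern = r'\s\d+:\d+:\d+[^a-c6]\d+$'
    sample = '2011-10-18 04:00:00+00'
    
    # Replace the matched time portion with nothing.
    print(re.sub(pattern, '', sample))  # prints: 2011-10-18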

    I then took that pattern and plugged it into this short script (a dozen or so lines, with detailed comments):

    import pandas as pd
    #You need this to do anything related to dataframes (i.e. "df"); 
    # if you're a data scientist and you use python, you may already 
    # be familiar with this
    import re 
    #Not strictly required here (pandas applies the regex itself), 
    # but handy if you want to test patterns with re.sub first
    
    d = '/Users/choina/Documents/untitled folder/Restaurants_in_Wake_County.csv'
    #I'm assigning the file to "d"
    df = pd.read_csv(d)
    #df means dataframe. Here I'm telling pandas to read the .CSV file 
    # that I just assigned ("d"), and consider that the "dataframe".
    
    print(df.dtypes)
    #This is optional, but when I do this I can see what the various 
    # datatypes are (according to pandas). The RESTAURANTOPENDATE column 
    # is an 'object'.
    
    df['RESTAURANTOPENDATE'] = df['RESTAURANTOPENDATE'].str.replace(r'\s\d+:\d+:\d+[^a-c6]\d+$', '', regex=True)
    #This says: treat the values in the RESTAURANTOPENDATE column as 
    # strings, find the pattern that matches this Regular Expression, 
    # and replace it with NOTHING. No "For Loop" is needed; pandas 
    # applies the replacement to every row in the column at once.
    df.to_csv('newrest.csv')
    #Save it, along with the other columns, to a new .CSV file called 
    # "newrest.csv"

    Code as an image for reference:

    Python script to clean CSV file for data load, Database Actions, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Actions

    In short (I’ve left detailed comments for reference), this little script searches for a pattern and replaces it with nothing wherever it’s found. We then take these results (including the rest of the .CSV file) and save them to a new file called “newrest.csv.”

    Please, reference this script. Tweak it as needed. Or drop me a note if you need an outside perspective. Good luck!

    One more thing

    Only Pandas is strictly required for this script to work; the re import is optional here.

    Note: Pickle is a library that deals with the serialization of objects in python. Reading about it makes my brain melt, but the good news is it has nothing to do with this script. When I ran an earlier version, a "from pickle import NONE, TRUE" line just appeared at the very top on its own; that was almost certainly my editor auto-importing pickle's TRUE constant because I'd typed TRUE instead of python's built-in True. There is no ghost in the machine: use regex=True and the mystery import disappears.

    Moving on

    The rest is pretty straightforward.

    I then went back into Database Actions (it had been so long that I’d expected a new version had been released already), loaded my data, and inspected the column in question:

    Updated CSV file in data load, Database Actions, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Actions

    Looks good to me!

    Magic

    Something I noticed was that my Autonomous Database automatically recognizes the “RESTAURANTOPENDATE” column as a data type of “Date.” I didn’t have to do anything special here, and I thought this was pretty cool. Take a look:

    Database Actions automatically recognizes datatypes Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Actions

    Addendum to the addendum

    Once imported, you can also use Regular Expressions in your database (with SQL). But if you’d like to clean your data before it reaches your ADB, the approach I’m writing about here is also a great option!

    Resources

    As promised, here are all the resources I’ve curated that might assist you should you choose to take the road less traveled…

    • Important/relevant documentation:
    • Stack Overflow – this is a series of “bookmarked” questions about Python and Regular Expressions that helped me. They might be of use. They aren’t intended to teach you, but rather to aid you in gaining a better perspective.
    • YouTube – a playlist I created that might be worth reviewing (at a minimum watch Case Digital’s “How to Remove Characters From a Pandas Dataframe In Python” video).
    • Regular Expression Editors:
      • Pythex (make sure you click the “Regular expression cheatsheet” button to expand/see a valuable quick reference – also, apparently a more complete version can be found here)
      • Regular Expressions 101 – this includes an enormous library of previously created Regular Expressions. There is a “debugger” too!

    An even simpler solution

    I discovered an even simpler solution to this problem. As I was reviewing the Pandas documentation, I noticed this functionality is built directly into Pandas with the “to_datetime” function.

    And you can take it one step further with what is referred to as a “.dt accessor” (you’ll see this in the updated code I’m including). This accessor allows you to manipulate a Series (aka a column) within a Pandas dataframe (df).

    There are quite a few time-date properties you can manipulate in Pandas, all of which can be found here. The “Time Zone Operations” section is where the “.dt accessor” is briefly mentioned. For more context, I’d recommend reviewing this section on the .dt accessor and how it interplays with a Pandas Series.

    Don’t stop there though, the main python documentation discusses the basics of date and time operations. Without further ado, on to the new code:

    import pandas as pd 
    
    #You don't have to do it like this. You could have a single "d" and then just comment/uncomment each line as needed. 
    
    d1 = '/Users/choina/Downloads/Food_Inspections.csv'
    d2 = '/Users/choina/Downloads/Food_Inspection_Violations copy.csv'
    d3 = '/Users/choina/Downloads/Restaurants_in_Wake_County copy.csv'
    
    #The same thing here, you could just have a single "d" and then just comment out the other two, when not needed.
    
    df_1 = pd.read_csv(d1)
    df_2 = pd.read_csv(d2)
    df_3 = pd.read_csv(d3)
    
    #Same goes for this too. This is either very slick of me, or incredibly lazy. 
    
    df_1['DATE_'] = pd.to_datetime(df_1['DATE_']).dt.date
    df_2['INSPECTDATE'] = pd.to_datetime(df_2['INSPECTDATE']).dt.date
    df_3['RESTAURANTOPENDATE'] = pd.to_datetime(df_3['RESTAURANTOPENDATE']).dt.date
    
    #Same, same here as well. You could do one at a time. But it works, I double-checked my work. 
    
    df_1.to_csv('newninspect.csv')
    df_2.to_csv('newviolate.csv')
    df_3.to_csv('newrest.csv')
    New python script to change date time in csv files, Chris Hoina, Senior Product Manager, ORDS, Database Tools, Oracle Autonomous Database

    About the update

    This code is certified fresh – meaning, it works; I triple-checked. The only thing I’ve noticed is that the three “to_datetime” lines must have “.dt.date” at the end. From what I am reading (and inferring from the documentation), you need to first “treat” the Series with the “to_datetime” function. After that, the entire dataframe is in limbo (not quite written out to a new .CSV), waiting. Before the next step, we can strip the time portion out using the .dt accessor (i.e. the “.dt.date” portion at the end of those same lines).
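
    If you want to see that two-step “treat, then strip” behavior in isolation, here is a tiny sanity check (the timestamp is made up, but it mirrors the format in these files):

    import pandas as pd
    
    s = pd.Series(['2011-10-18 04:00:00+00'])
    
    # Step one: "treat" the Series, converting the strings to real datetimes.
    treated = pd.to_datetime(s)
    
    # Step two: the .dt accessor strips the time portion, leaving just the date.
    print(treated.dt.date)  # 0    2011-10-18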

    From there it is simply a matter of writing out these updated dataframes to the three new expectant .CSV files.

    Find Me

    Please come back for more. I’m going to be working on views in my Autonomous Database, and then later, I’ll REST-enable them with the help of ORDS!

  • GET requests with Python, ORDS, and the Oracle Autonomous Database

    GET requests with Python, ORDS, and the Oracle Autonomous Database

    Problem

    I wanted to take the Objects,

    Objects in SQL Worksheet in Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,
    Objects

    in my Admin schema and copy them into a newly created Python Developer database user/schema. I have been meaning to do this for a while now, since I’d like to approach all future development work as a typical [developer] user might.

    I previously created a “developer” user, but the recommendation was to create a user specifically for python development. This recommendation makes sense because I’d like to explore other developer frameworks — JavaScript/React, Derelicte, Golang, Hoobastank, Swift, Smashmouth.

    DISCLAIMER: Some of the aforementioned languages/frameworks are entirely fictional.

    So in creating this user and populating the schema, I could have just copied these objects over in seconds. I’m sure some already-existing tool would have enabled me to do this. A less-elegant option would be to do a local upload: I’m confident I could export the tables of my Admin schema and then upload them as the developer user in Database Actions. Doing so here:

    Data Load in Database Actions Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools

    But those options were too straightforward, effortless, and expeditious. I needed an approach that would make me question my life decisions. Something that would force me to rethink what it meant to be human.

    Mostly, I wanted an excuse to tinker with ORDS some more. While I didn’t start with a well-defined use case, I was able to retcon (a portmanteau of the words retroactive and continuity) one.

    Scenario

    The scenario is that we have REST-enabled schema objects (in this case, various tables), and maybe we want to execute GET requests against one of those tables. Perhaps we want all the data from that table (returned as JSON). And for whatever reason, we want to save this information in a JSON file. Plausible? Not likely. Possible? If I can do it, then yes, 100% possible.

    Solution

    NOTE: I've since disabled REST for these tables, so none of these URLs currently work. Also, for security reasons, you probably want to require authentication to access these. The below screenshots and code snippets are for demonstration purposes. In this case, that's probably acceptable, but please (PLEASE) exercise caution if you are recreating this at your workplace.

    REST-enabling

    When you REST-enable a table, you’re provided a URL (aka a REST Endpoint) for viewing. It will look something like this:

    https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/admin/scores/

    The response looks like this:

    First 25 results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    You are seeing the first 25 results from my “Scores” table. I’ve also created an alias name for that table as an extra level of protection. The table’s actual name is not “Scores,” but rather something else.

    If you collapse the “items,” it’s easier to see the rest of the “name: value” pairs:

    Collapsed results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    I can do my best to explain what we see here:

    • “items: […]” are the individual entries in my table (the columns are the names you see in black bold). ORDS includes links for calling each single entry on its own
    • “hasMore: true” essentially means that there are more pages of information to follow
    • “limit: 25” refers to the fact that ORDS is only returning 25 results at a time. I didn’t set this; it was the default.
    • “offset: 0” is tricky. From what I can tell, the offset only becomes a consideration after the first page. You begin with no offset (this makes no sense, I know, but I will explain this shortly)
    • “count: 25” – we have entries 1-25 on this page
    • “links: […]” – these are all pretty self-explanatory. Two links, in particular, are pretty interesting. They are:
      • “describedby” and
      • “next”

    If I click the “describedby” link, you’ll see something that looks like this:

    Described by URL ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    On this page, you can review valuable information without ever logging into Database Actions. I found this extremely helpful; from here, I could see the table name, Primary Key (if it exists), and the column names and types for the table.

    Another link I wanted to highlight was “next.” If you look closely, you’ll see that the “next” URL includes an offset of zero:

    Initial 25 results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    However, if I click that URL (I’ve collapsed all the items on this page as well), I can see that “offset” has increased to 25, and the new “next” URL shows “offset=50”. This next part will do a better job of explaining what is happening.

    Next 25 results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    Since we are on the second page of results, ORDS knows that we began with an offset of zero (the beginning of the table) and a limit of 25. Each “next” URL will increment by 25 (the offset) and return 25 results at a time. And it will keep producing “next” URLs until the “hasMore: false” condition is finally satisfied. We can modify a lot of this, but this is what ORDS does for you automatically/by default. Great news for me because at my current level of experience, the more steps that I can automate, the better!

    Python

    Prepare yourself for a bit of a leap. At this point, I was pretty confident I could copy/paste my way through some python code and pull down this data, and then save it in some format.

    Finally, after about a week, I was able to duct-tape something together. Courtesy of Google, StackOverflow, so many YouTube video tutorials (so, so many), here is what I’ve arrived at:

    import requests
    import json 
    
    # requests is a python library that allows you to send GET, 
    # PUT, DELETE, HEAD, and OPTIONS requests. I'm just doing 
    # GET requests though. The json library comes with python, 
    # but I need it here to be able to work with the JSON that 
    # is returned. I'll also need it to create the json file 
    # (at the end).
    
    offset = 0
    
    # I'm basically telling this little script that we should 
    # expect to start with an offset of zero. If you recall, 
    # this just means we are beginning on the first page. 
    # Hence, there is no offset. 
    
    url = 'https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/admin/places/?offset={}'
    
    # I've added in this modifier {}, which, as you'll see in a 
    # bit, allows me to increment by 25 with each subsequent 
    # GET request.
    
    hasMore = True
    
    # Read this as "hasMore is equal to True". In other words, 
    # if there is more STUFF, then we should see "true" (go 
    # back and look at the screenshots if you don't know what 
    # this means.)
    
    filename = 'places.json'
    
    # "filename" is the json file we are going to create to 
    # put all these results into.
    
    while hasMore:
    
        # We are saying, "While hasMore is equal to true, then..."
    
        response = requests.get(url.format(offset))
    
        # ...recognize that "response" is a GET request that 
        # consists of the URL (from above) plus whatever the 
        # current offset is.
    
        data = response.json()
    
        # data is just whatever the results are from the above 
        # response.
    
        with open(filename, 'a') as json_file:
            json.dump(data, json_file)
    
        # Append these results (that's what the "a" means) to 
        # the file, so every page ends up in "places.json" -- 
        # including the final page, where "hasMore" is false.
    
        if data.get('hasMore'):
            offset += 25
        else:
            break 
    
    # In lay terms: keep requesting, adding 25 to the offset 
    # for each subsequent GET request, and appending each page 
    # of results to the file. Whenever a page comes back with 
    # "hasMore": false, there is nothing more to request, so 
    # stop (that is what the "else: break" means).

    I’m going to include an image so you can see the nesting better.

    python script for ORDS, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    And once I run this, I’m left with a JSON file (“places.json”) that has all my entries saved à la:

    json file output from ORDS, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    Now what?

    At this point, I have a few options. I could:

    • include code that would convert this JSON file to a .CSV file
    • include code to indent the JSON file, making it more readable (see the small tweak after this list), or
    • drag and drop it into my Autonomous Database (via Data Load in Database Actions) with some slight tweaking (I just tried, so I already know I need to do more massaging)
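
    For that second option, the standard library’s json.dump accepts an indent argument, so a single small tweak to the append step in the script above would pretty-print each page as it is written:

    # Same append step as in the script above, just with indented output.
    with open(filename, 'a') as json_file:
        json.dump(data, json_file, indent=2)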

    The End

    So what have we learned here? Even somebody with very basic skills can interact with an Autonomous Database via ORDS.

    If you have an internet connection and are eager enough, the sky is the limit. Putting things into perspective, I wouldn’t have been able to do this about a week and a half ago.

    Truthfully, you’re not that far behind! I encourage you to give it a go.

    Find me

  • Updating VARCHAR2 fields in Oracle Database Actions

    Updating VARCHAR2 fields in Oracle Database Actions

    Update

    I’ve run into more problems with this python/flask/ords/adb app idea of mine. Turns out I’m too [developer] dumb to do some of the things that I’d originally envisioned. Now that I’ve become more familiar with python, flask, the Oracle Autonomous Database, and our Database Tools (like ORDS and Database Actions), I’m realizing that I need to rethink my approach.

    First off, I’ve updated the (not trying to be pretentious here) User Journey for this project.

    SIDE NOTE: The User Journey is something that I really should pay more attention to (especially considering it was one of my concentrations in my MBA program, and it was a huge initiative in my previous role at Big Blue) and you should too!

    Ch-ch-ch-ch-changes

    I’m leaning towards (after a recent discussion with thatjeffsmith) more of a drop-down style of search. I’d like to present a user with all the available cities (having taken these from my “Restaurants” table) in Wake County.

    I think I can populate a list of cities to choose from with the help of REST APIs (via ORDS). But I need to do some clean-up first.

    Side note

    In about a minute, you’ll probably come to the realization that my approach is less than elegant. Remember, I’m not a developer. I’m kind of stumbling through this, and I’ll need to make some sacrifices along the way (we call those “trade-offs” in business-speak). Unfortunately, there are simply some techniques/skills that I do not know yet. For a visual representation of these deficits, I draw your attention to:

    Me the developer versus an actual developer, Chris Hoina, Senior Product Manager, ORDS, Database Tools, Oracle Autonomous Database, Database Actions

    Back to your regularly scheduled program

    On to clean-up. The cities in this table are a bit all over the place. Let’s take a look.

    From Database Actions, I can navigate to a SQL Worksheet. From there, I can inspect my tables (Right-click > Edit) or jump straight into the SQL. Since I know my “CITY” column is the problem, I’ll run this:

    SELECT DISTINCT CITY
    FROM ADMIN.RESTAURANTS
    ORDER BY 1

    Some screenshots of what this looks like in Database Actions:

    NOTE: The "DISTINCT" parameter drops any duplicates, "ORDER BY" returns results in ascending order.

    I suspect that the program used to input health inspections allows for some level of free text.

    Tangent / Story

    I’ve used an application similar to this in a previous life, but it was a Computer Aided Dispatch (CAD) application. You could enter free text or select from users’ previous entries. There was a lot of flexibility, but because of that, there were many variations on things like street names, cities, locations, parking lots, person names, etc. For the user, awesome; for the developer or database administrator, less so.

    At this point, it is pretty clear how much normalization I’ll have to do to all these city names. Now I just need to figure out:

    • which case to use (Upper, Sentence, Lower, SaRCaSTic, etc.)
    • which city name variations to use

    I randomly chose a city and updated it with the following SQL command:

    UPDATE Restaurants SET CITY = 'Holly Springs'
    WHERE UPPER(CITY) = 'HOLLY SPRING'

    In this example, I searched for all instances of “Holly Spring” and replaced them with the correct “Holly Springs”. I can then repeat the process for all other cities. Again, the abridged walk-through of this is here.

    I actually know this data well, since I grew up in the area. So deciding on the correct city names shouldn’t be too challenging. But familiarizing yourself with your own data might not be such a bad idea. For instance, in this table, one of the cities displays as “Research Triangle Park”. I don’t think it’s common to refer to it like this; we usually just call it “RTP”. I think little details like that really help to elevate your application.

    Next steps

    I’ll keep this brief. But what I think I’m going to have to do next is create a REST API based on this Restaurant table. I should have a better idea in a few days. But the idea is, that once a user selects a city, it will pass that city as a parameter to a GET request (again, courtesy of ORDS) and then return additional data. A user would then be able to further filter on that returned data, eventually ending on their target restaurant.

    Okay, stay tuned…as I feel like I’m on the precipice of something exciting!

    Catch me if you can


    Abridged walk-through

    Select Worksheets in Database Actions Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools

    From Database Actions, navigate to the “SQL” feature. You’ll see this screen appear.

    NOTE: I had a previous SQL Worksheet saved, but you can enter SQL statements directly into the default worksheet. 
    Before city field has been updated in Database Actions Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools

    If you expand the image, you can see the statement that I used.

    SELECT DISTINCT CITY
    FROM ADMIN.RESTAURANTS
    ORDER BY 1

    This will produce a list of all the cities in their current form. Notice the four different variations of “Fuquay-Varina”.

    Updating fields in City Column Database Actions Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools

    I’ll comment out lines 1-3 and enter the following SQL statement to update Fuquay-Varina:

    UPDATE Admin.Restaurants SET CITY = 'Fuquay-Varina'
    WHERE UPPER(CITY) = 'FUQUAY VARINA'

    I use the “DISTINCT” keyword to drop any duplicates in the results. “ORDER BY” returns results in ascending order.

    You’ll see that 77 rows have been updated.

    After Fuquay Varina has been updated in Database Actions Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools

    I’ll un-comment lines 1-3 and comment out lines 5-6.

    Based on the results, we’ve reduced the variations of “Fuquay-Varina”. I’ll continue this work until I have a smaller set of more consistent cities.