Tag: Database

  • Python and the Oracle Autonomous Database: Three Ways to Connect

    Python and the Oracle Autonomous Database: Three Ways to Connect

    Watch the deep dive videos:

    Part I

    Overview and connecting with the python-oracledb library

    Part II

    Connecting with Oracle REST APIs unauthenticated

    Part III

    Custom Oracle REST APIs with OAuth2.0 Authorization

    Welcome back

I finally had a break in my PM duties to share a small afternoon project I started a few weeks ago. I challenged myself to a brief Python coding exercise. I wanted to develop some code that allowed me to connect to my Autonomous Database using either our python-oracledb driver (library) or Oracle REST Data Services (ORDS).

    I undertook this effort as I also wanted to make some comparisons and maybe draw some conclusions from these different approaches.

    NOTE: If you don't feel like reading this drivel, you can jump straight to the repository where this code lives. It's all nicely commented and has everything you need to get it to work. You can check that out here. 

    The test files

    Reviewing the code, I’ve created three Python test files. test1.py relies on the python-oracledb library to connect to an Oracle Autonomous database while test2.py and test3.py rely on ORDS (test3.py uses OAuth2.0, but more on that later).

    Configuration

    Configuration directory

    I set up this configuration directory (config_dir) to abstract sensitive information from the test files. My ewallet.pem and tnsnames.ora files live in this config_dir. These are both required for Mutual TLS (mTLS) connection to an Oracle Autonomous database (you can find additional details on mTLS in the docs here).

    ewallet.pem and tnsnames.ora files

    Other files

    OAuth2.0, Test URLs, and Wallet Credential files

    Other files include oauth2creds.py, testurls.py, and walletcredentials.py. Depending on the test case, I’ll use some or all of these files (you’ll see that shortly).

NOTE: In case it's not obvious, I wouldn't put any sensitive information into a public git repository.

    Connecting with python-oracledb

One approach to connecting to your Oracle database is with the python-oracledb driver (library). An Oracle team (people much more experienced and wiser than me) created this library, and it makes connecting with Python possible.

    FYI: I’m connecting to my Autonomous Database. If you want to try this, refer to the documentation for using this library and the Autonomous database. You can find that here.

    The Python code that I came up with to make this work:

# Connecting to an Oracle Autonomous Database using the python-oracledb driver.
    
    import oracledb
    
# A separate python file I created and import here. It contains my credentials, so as not to show them in this script.
    
    from walletcredentials import uname, pwd, cdir, wltloc, wltpwd, dsn
    
    # Requires a config directory with ewallet.pem and tnsnames.ora files.
    
    with oracledb.connect(user=uname, password=pwd, dsn=dsn, config_dir=cdir, wallet_location=wltloc, wallet_password=wltpwd) as connection:
        with connection.cursor() as cursor:
    
    # SQL statements should not contain a trailing semicolon (“;”) or forward slash (“/”).
    
            sql = """select * from BUSCONFIND where location='ZAF'
            order by value ASC """
            for r in cursor.execute(sql):
                print(r)

    In Line 7, you can see how I import the wallet credentials from the walletcredentials.py file. Without that information, this code wouldn’t work. I also import the database username, password, and configuration directory (which includes the ewallet.pem and tnsnames.ora files).

From there, the code is pretty straightforward. Some library-specific syntax is required (the complete details are in the docs, found here), but aside from that, nothing is too complicated. You’ll see the SQL statement in Lines 16-17; the standard SQL format looks like this:

    SELECT  * FROM busconfind WHERE location='zaf'
    ORDER BY value ASC;

    And here is an example of this SQL output in a SQL Worksheet (in Database Actions):

    Reviewing the SQL in Database Actions
    FYI: This is a Business Confidence Index data-set, in case you were curious (retrieved here). 

    That SQL allows me to filter on a Location and then return those results in ascending order according to the Value column. When I do this using the python-oracledb driver, I should expect to see the same results.

NOTE: You've probably noticed that the SQL in the python file differs from that seen in the SQL Worksheet. That is because the statement lives inside a Python string (here, a triple-quoted string, so the single quotes around ZAF don't need escaping), and because you must remove the trailing semicolon from the SQL statement. It's all in the python-oracledb documentation; you just have to be aware of this.

    Once I have all the necessary information in my walletcredentials.py file, I can import that into the test1.py file and execute the code. I chose to run this in an Interactive Window (I’m using VS Code), but you can also do this in your Terminal. In the images (from left to right), you’ll see the test1.py file, then a summary of the output from that SQL query (contained in the test1.py code), and finally, the detailed output (in a text editor).

    Wrap-up

For those who have an existing Free Tier tenancy, this could be a good option. Of course, you have to do some light administration. But if you have gone through the steps to create an Autonomous database in your cloud tenancy, you probably know where to look for the tnsnames.ora and other database wallet files.

I’m not a developer, but I think it would be nice to simplify the business logic found in this Python code, or maybe better, abstract it completely. For prototyping an application (perhaps one that isn’t microservices-oriented) or for data and business analysts, this could do the trick. In fact, the data is returned to you as rows of tuples, so turning this into a CSV or reading it into a data analysis library (such as pandas) should be fairly easy!
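As a quick sketch of that idea (this reuses the connection object from test1.py; the pandas import and the output file name are my additions, not part of the repository code):

import pandas as pd

# Reusing the "connection" created in test1.py.
with connection.cursor() as cursor:
    cursor.execute("""select * from BUSCONFIND where location='ZAF'
                      order by value ASC""")
    columns = [col[0] for col in cursor.description]
    df = pd.DataFrame(cursor.fetchall(), columns=columns)

# Rows of tuples drop straight into a DataFrame, and from there into a CSV.
df.to_csv("busconfind_zaf.csv", index=False)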

    Connecting via ORDS: sans OAuth2.0

    Auto-REST and cURL

    I’m still using the “devuser” (although this may be unnecessary, as any unsecured REST-enabled table would do). I’m using the same table as before; the only change I’ve made is to auto-REST enable the BUSCONFIND table for the test2.py code.

    In the following images, I’m retrieving the cURL command for performing a GET request on this table.

    NOTE: In a recent ORDS update, we made available different shell variations (this will depend on your OS); I've selected Bash.

From there, I take the URI (learn more on URIs) portion of the cURL command and place it into my browser. Since this table is auto-REST enabled, I’ll only receive the first 25 rows.

    NOTE: The ORDS default pagination is limit = 25.

    The code

    And the code for this test2.py looks like this:

    # Auto-REST enabled with ORDS; in an Oracle Autonomous Database with query parameters.
    
    import requests
    import pprint
    
    # Importing the base URI from this python file.
    
    from testurls import test2_url
    
    # An unprotected endpoint that has been "switched on" with the ORDS Auto-REST enable feature. 
    # Query parameters can be added/passed to the Base URI for GET-ing more discrete information.
    
url = (test2_url + '?q={"location":"ZAF","value":{"$gt":100},"$orderby":{"value":"asc"}}')
    
    # For prototyping an application, in its earlier stages, this could really work. On your front end, you 
    # expect the user to make certain selections, and you'll still pass those as parameters. 
    # But here, you do this as a query string. In later stages, you may want to streamline your application
    # code by placing all this into a PL/SQL or SQL statement. Thereby separating application
    # logic and business logic. You'll see this approach in the test3.py file. 
# This works, but you can see how it gets verbose, quick. It's a great jumping-off point.
    
    responsefromadb = requests.get(url)
    pprint.pprint(responsefromadb.json())

    Lines 8 and 13 are the two areas to focus on in this example. In Line 8 imported my URL from the testurls.py file (again, abstracting it, so it’s not in the main body of the code).

    The test2.py and testurls.py files

    And then, in Line 13, I appended a query string to the end of that URL. ORDS expects the query parameters to be a JSON object with the following syntax:

    [ORDS Endpoint]/?q={"JSON Key": "JSON Value"}

The complete query string below requests nearly the same information as the test1.py example (with an added filter on the Value column):

url = (test2_url + '?q={"location":"ZAF","value":{"$gt":100},"$orderby":{"value":"asc"}}')

This string begins with that same Base URI for the ORDS endpoint (the auto-REST enabled BUSCONFIND table) and then appends the “?q=” query string prefix, followed by these parameters:

    1. Filter by the location "ZAF"
    2. Limit the search of these locations to values (in the Value column) greater than ($gt) 100
    3. Return these results in ascending order (asc) of the Value column
    NOTE: You can manipulate the offsets and limits in the python-oracledb driver too. More info found here. And filtering in queries with ORDS can be found here. 
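For example, here is a sketch of requesting the next page of results from the same auto-REST endpoint (it assumes the test2_url from test2.py; offset and limit are standard ORDS query parameters):

import requests
import pprint

from testurls import test2_url

# Request the next 25 rows (rows 26-50) from the auto-REST enabled table.
next_page = requests.get(test2_url, params={"offset": 25, "limit": 25})
pprint.pprint(next_page.json())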

    And if I run the test2.py code in the VS Code Interactive Window, I’ll see the following summary output.

    Summary output from the response in test2.py

    Here is a more detailed view in the VS Code text editor:

    Detailed output with helpful links

    Wrap-up

    A slightly different approach, right? The data is all there, similar to what you saw in the test1.py example. There are a few things to note, though:

    • The consumer of this ORDS REST API doesn’t need access to the database (i.e. you don’t need to be an admin or have a schema); you can perform GET requests on this URI.
    • The response body is in JSON (ubiquitous across the web and web applications)
      • Also, language and framework agnostic (the JSON can be consumed/used widely, and not just with Python)
    • You are provided a URI for each item (i.e. entry, row, etc.)
    • No need for SQL; just filter with the JSON query parameters
    • No business logic in the application code
    • Needless to say, no ORMs or database modeling is required for this approach

    However…security is, ahem…nonexistent. That is a problem and flies in the face of what we recommend in our ORDS Best Practices.

    Connecting via ORDS: secured with OAuth2

Note: This is an abbreviated explanation; I'll be posting an expanded write-up on this example posthaste!

Since this is what I’m considering “advanced” (it’s not difficult, there are just many pieces), I’m going to keep this section brief. Long story short, I’ll take those query parameters from above and place them into what is referred to as a Resource Handler.

TIME-OUT: Auto-REST enabling a database object (the BUSCONFIND table in this case) is simple in Database Actions. It's a simple left-click > REST-enable. You saw that in the previous example. You are provided an endpoint and you can use the query parameters (i.e. the JSON {key: value} pairs) to access whatever you need from that object.
    
However, creating a custom ORDS REST endpoint is a little different. First you create a Resource Module, then one (or many) Resource Templates, and then one (or many) Resource Handlers. In that Resource Handler, you'll find the related business logic code for that particular HTTP operation (the menu includes: GET, POST, PUT, and DELETE).

    The Resource Module

    The process of creating a custom ORDS API might be difficult to visualize, so I’ll include the steps I took along with a sample query (in that Resource Handler) to help illustrate.

Chances are you may be the administrator of your Always Free tenancy, so you have full control over this. Other times, you might be provided the REST endpoint. In that case, you may not ever have to worry about these steps. Either way, you can see how we reproduce the query with this final example (test3.py), while abstracting the business logic and keeping it in the database.

    Security

    The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf.

    RFC 6749: The OAuth 2.0 Authorization Framework

    I’ll keep this section brief, but I’m protecting this resource through the aid of an ORDS OAuth2.0 client. I’ve created one here:

    After creating a client you can use the provided URL for requesting a new Bearer Token

    And, as you’ll see shortly, I’ll rely on some Python libraries for requesting an Authorization Token to use with the related Client ID and Client Secret. If you want to nerd out on the OAuth2.0 framework, I challenge you to read this.

    test3.py example

    NOTE: Remember, I'm keeping this section intentionally brief. It deserves a slightly deeper dive, and class is almost over (so I'm running out of time). 

    The code for this example:

    # Custom ORDS Module in an Oracle Autonomous Database.
    
    import requests
    from requests_oauthlib import OAuth2Session
    from oauthlib.oauth2 import BackendApplicationClient
    import pprint
    import json
    
    # Importing the base URI from this python file.
    
    from testurls import test3_url
    
    # A separate python file I created and later import here. It contains my credentials, 
    # so as not to show them in this script here. 
    
    from oauth2creds import token_url, client_id, client_secret
    
    client = BackendApplicationClient(client_id=client_id)
    oauth = OAuth2Session(client=client)
    
    token = oauth.fetch_token(token_url, client_id=client_id, client_secret=client_secret)
    
    bearer_token = token['access_token']
    
# Location can be anything from the table. Now, only the single variable needs to be passed. Business logic has been abstracted somewhat, as it now resides within
# ORDS. This could make your application more portable (to other languages and frameworks, since there are fewer idiosyncrasies and dependencies):
    
    location = "ZAF"
    
    # print(location)
    
    # ------------------------------------------------------------------------------ # 
    # In Database Actions, we:
    #   1. Create an API Module
    #   2. Then create a Resource Template
    #   3. Finally, a GET Resource Handler that consists of the code from test1.py:
    
    #           select * from BUSCONFIND where location= :id
    #               order by value ASC
    # ------------------------------------------------------------------------------ # 
    url = (test3_url + location)
    # print(url)
    
    responsefromadb = requests.get(url, headers={'Authorization': 'Bearer ' + bearer_token}).json()
    
    # This step isn't necessary; it simply prints out the JSON response object in a more readable format.
    
    pprint.pprint(responsefromadb)

Lines 11 and 16 deserve some attention here. The URL in Line 11 comes from the testurls.py file, seen in the previous example. And the contents of Line 16 come from the oauth2creds.py file. Here are the files, side by side:

    The test3.py, testurls.py, and oauth2creds.py files

    As you can see in the testurls.py file, I’m relying on the test3_url for this example. And the OAuth2.0 information you see comes directly from the OAuth Client I created in Database Actions:

    In this image, you can see the Client ID and Client Secret

    If I put that all together, I can execute the code in test3.py and “pretty print” the response in my Interactive Window. But first I need to adjust the Resource Handler’s URI (the one I copied and pasted from the “REST Workshop”). It retains the “:id” bind parameter. But the way I have this Python code set up, I need to remove it. It ends up going from this:

test3_url = 'https://abc123def456ghi789-darmok.adb.us-ashburn-1.oraclecloudapps.com/ords/devuser/myapi/locations/:id'

    To this:

    test3_url = 'https://abc123def456ghi789-darmok.adb.us-ashburn-1.oraclecloudapps.com/ords/devuser/myapi/locations/'

    Otherwise, if I don’t remove that bind parameter when the code executes, the target URL will end up looking like this:

    'https://abc123def456ghi789-darmok.adb.us-ashburn-1.oraclecloudapps.com/ords/devuser/myapi/locations/:idZAF'
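An alternative (my own suggestion, not something from the repository) is to leave the “:id” placeholder in testurls.py and substitute it at request time:

# Hypothetical variation: keep the ":id" placeholder in the stored URL
# and swap it for the location just before making the request.
url = test3_url.replace(":id", location)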

    With that out of the way, I can run this code and review the output.

From top to bottom, left to right, you’ll see I first execute the code in the Interactive Window. From there, I can review a summary of the response to my GET request. That pretty print library allows us to see the JSON array in a more readable format (one that has indentation and nesting), which you can see in the second image. The third image is a more detailed view of the first half of this response. And I include the final image to highlight the helpful URLs that are included in the response body.

Since my limit = 25 and the response includes 'hasMore': True (seen in the output in that third image), I know there are more items. You can adjust the limit and offset in subsequent requests, but I’ll save that for another day.

    Wrap-up

You can probably tell, but this is like an expansion of the previous example. But instead of relying on auto-REST enabling, you are in full control of the Resource Module. And while you don’t need to use OAuth2.0, it’s good practice to use it for database authentication. You can see how the response comes through a little differently compared to the previous example, but it is still very similar.

In this example, I did all the work, but that might not be the case for you; much of it might be handled for you. The main thing I like about this example is that we rely on stable and popular Python libraries: requests, requests_oauthlib, and oauthlib.

    The fact that this is delivered as a JSON object is helpful as well (for the same reasons mentioned in the second example). And finally, I enjoy the fact that you only need to pass a single parameter from your (assumed) presentation layer to your application layer; an example might be a selection from an HTML form or drop-down menu item.

    The end

    We’re at the end of this fun little exercise. As I mentioned before, I will expand on this third example. There are so many steps, and I think it would be helpful for people to see a more detailed walk-through.

    And be on the lookout (BOLO) for a video. There’s no way around this, but a video needs to accompany this post.

    And finally, you can find all the code I review in this post in my new “blogs” repository on GitHub. I encourage you to clone, fork, spoon, ladle, knife, etc…

    Oh and…

  • Podman, Oracle Container Registry, and SQLcl

    Podman, Oracle Container Registry, and SQLcl

    Containers are kind of like terrariums.


    Table of Contents

    Click a link to jump straight to that section.

    1. Intro to Podman
    2. What I am trying to do
    3. Getting all the prerequisites
    4. Installing Podman Part 1
    5. Resolving X-code (xcrun) problems
    6. Installing Podman Part 2
    7. Exploring the Oracle Container Registry
    8. Getting to know the container
    9. Reviewing container custom configurations
    10. Attempting to understand ports
    11. Starting the Podman Linux VM
    12. Reviewing database logs
    13. Changing the default password
    14. Logging on with SQLcl
    15. Creating, loading a new table
    16. SQL query on the new table
    17. About data persisting in Podman
    18. Dénouement

    Intro to Podman, man

    What is Podman?

    Podman is a daemonless container engine for developing, managing, and running OCI Containers on your Linux System. Containers can either be run as root or in rootless mode. Simply put: alias docker=podman.

    the deets

    I’ve spent the past couple of weeks setting up Podman to work on my MacBook.

    My current setup

I really wanted to take advantage of our Oracle Container Registry. While we have several containers, the one I’m most interested in is the freely available Oracle Enterprise Database version 21.3.0.0.

    FYI: It can be found in the Database containers category.

I wanted to learn more about containers while also connecting locally (i.e., from my MacBook) via SQLcl to said container. In that scenario, as far as my computer is concerned, the container is a production database running elsewhere in the world. Oh, and I’m using Podman instead of Docker to do all this.

    A summary of what I intend to do

In short, I wanted to:

    • provision a database inside a container
    • be able to start and stop it at will, and 
    • be able to connect to it with various Oracle database tools

    I began this exercise with SQLcl since it was used in one of my recent posts. But as a follow-on to this article, I’d like to install ORDS on my local computer and then connect again but with ORDS joining the party. But that’s for another time. 

    Let’s make ah the pizza

    But before connecting to this container, you’ll need a lot of prerequisites. As far as “ingredients” go, you’ll need the following:

    • Homebrew installed and updated. (If you need to do this, review my recent article for instructions.)
• Podman installed and updated
• Apple X-Code Command Line Tools updated (this is tricky, so check my notes below)
• SQLcl (you can review the installation steps here)
• A sample CSV file (the subject doesn’t matter; I grabbed one from Kaggle – “IMDb’s Top 100 Movies“)

    The setup before the setup

    Since this was such a huge PITA, I’m going to walk through all the steps I took to make Podman work on my MacBook. I’ve done this about ten times so far to make sure I’m clearly explaining every step I took.

    I first opened up a new Terminal session using Spotlight (Left Command + Spacebar). Once in Spotlight, I searched for “terminal” and then hit enter. A new Terminal window will appear.

    From there, I reviewed Homebrew using the brew list command. If you’re following along, you’ll see a list similar to mine, depending on what you have installed.

    Wo ist mein (or is it meine?) Podman?

    Attempting to install Podman on Mac

    The next part is easy. I installed Podman with the following command: brew install podman. Homebrew will run through an auto-update, and eventually, Podman will begin installing.

    Upon the first installation, and depending on the macOS you are on, you may see a couple of errors appear. I can tell you they will cause issues within Podman later on down the line. So (and without having to take you back down the rabbit hole with me), you’ll need to uninstall Podman with the brew uninstall podman command.

    The errors can be seen in this image:

    Errors with Podman installing

There are a few ways one can remedy this. First, you should uninstall Podman, close your Terminal window, and open up a new Terminal window. I found (via this GitHub issue) that this is a known bug. Some have suggested running the brew doctor command, which reveals any potential problems Homebrew has discovered. This seems like a good practice regardless, and I wasn’t aware of this feature until now!

    And while writing this article, I did just that and found two errors I’ll need to fix. I’m still trying to figure out what either means, but the one about the executable is troubling.

    Errors found with brew doctor command

    X-Code problems

    But back to the Podman issue. To resolve the xcrun errors, I stumbled upon a solution buried deep in the recesses of the internet. Long story short, I needed to manually install Apple’s X-Code Command Line tools. But if you try and the installation fails, you have to take an extra step.

    BTW, it did NOT take 78 hours to download and install

    The x-tra step

If the xcode-select --install command fails, you have to remove the Command Line Tools from your machine altogether. I did this with the following command:

sudo rm -rf /Library/Developer/CommandLineTools

    If you want to bore yourself with this issue, here are some resources I found: 


    On sudo

    I’ve seen sudo in the past; I wonder if I ever bothered to look up its meaning. Taken directly from the docs:

    Sudo (su “do”) allows a system administrator to give certain users (or groups of users) the ability to run some (or all) commands as the superuser or another user, while logging all commands and arguments. Sudo operates on a per-command basis, it is not a replacement for the shell.

    A short introduction

Back to our regularly scheduled program…you’ll probably need to enter your system’s password (the thing you use to log on to your computer when it first starts up and/or wakes). And after that, restart your Terminal (I don’t believe I did this, but it’s probably a good idea).

    Once that new Terminal window fired up, I used the following command to install the latest X-Code Command Line tools:

    sudo xcode-select --install

    Reminder, it will not take 78 hours to install this. I just followed the prompts (license terms, the usual stuff, etc.).

    NOTE: I suspect we have to do this because for some reason, X-Code Command Line tools are not updated upon every macOS version update. So, who knows when the last time these tools have been updated. This is just a hunch, but in reality, I've no idea what I'm talking about. 

    Installing Podman, second attempt

    Once that installation was completed, I restarted the Terminal and issued the command to install Podman:

    brew install podman 
    NOTE: If you want to follow along with the Podman instructions, you can do that here (those are the actual docs, very easy to do). 

    If all goes well, your output will look something like this:

    No errors after the updating Command Line tools

    If you issue the brew list command, you’ll see both SQLcl and Podman installed.

    Ready to roll with SQLcl and Podman
    PRO TIP: You can issue the brew upgrade sqlcl command to update to the latest brew-available version of SQLcl.

    To the Oracle Container Registry

    Are we ready yet? Well, almost. Again, if you’re following along, navigate to our Oracle Container Registry site to retrieve the database container for this “recipe.” The path I took was Landing page > Database > Enterprise.

    YOU NEED TO SIGN IN for this to work!!! Oh, suuuuure… it’ll seem like it’s working when you’re in Podman, and you’ve tried ten times…but then it just keeps failing, and failing, and failing! So be sure to sign in (or create an account if you haven’t already). 

Once you’ve signed in and chosen your preferred language, you’ll see this:

    I’m ready to head back to the Terminal

    PAUSE: reviewing the limitations of this exercise

    Alright, so there are a few limitations I should address, and in no particular order:

    • Checkpointing containers in Podman
    • Volumes in Podman
    • Creating the database versus signing on to the database

    Checkpointing currently doesn’t work in Podman (at least for Macs on macOS Ventura). This is documented as well. Here’s a GitHub issue I found. I don’t seem to be the only one with the issue. I spent about a day on this trying to get it to work. I couldn’t figure it out; maybe if you’re reading this, you know the secret. Please share if you know!

    Secondly, I couldn’t figure out how to mount a volume to a container. I know this is fundamental to containers, but I encountered error after error for days. And for the purposes of this exercise, it isn’t a big deal. Now, if I were on an actual development team, that would be a different story. But I’m too dumb for development, that is why I’m a product manager 🤣!

    Finally, working with containers requires a paradigm shift. Shortly you’ll see that I’m setting up a container and “starting” the database therein. Later, I’ll separately log on to that database, using SQLcl, after the database is up and running. They are two different steps.

    Working with the container

    Looking at this screen you would think, “I’m just going to jump right in and execute the first command I see on this page.” Wrong!

    Initial docker run command

    Actually, you do NOT want to do that. You must scroll down to the “Connecting from outside of the container” section. Because I’m going to be connecting to this container from the outside.

    Referring to the Custom Configurations section

I know this documentation mentions SQL*Plus, but this all applies to SQLcl also. And if you refer to my previous SQLcl post, you can review the logon syntax. The critical point is that I need to start the container with the -p (port) option included. Are you still with me? Let’s take a trip to the “Custom Configurations” section.

    Reviewing the custom configurations section

    Referring to the -p configuration parameters

    I’ve cleaned up the code block for readability:

    docker run -d --name <container_name> \
     -p <host_port>:1521 -p <host_port>:5500 \
     -e ORACLE_SID=<your_SID> \
     -e ORACLE_PDB=<your_PDBname> \
     -e ORACLE_PWD=<your_database_password> \
     -e INIT_SGA_SIZE=<your_database_SGA_memory_MB> \
     -e INIT_PGA_SIZE=<your_database_PGA_memory_MB> \
     -e ORACLE_EDITION=<your_database_edition> \
     -e ORACLE_CHARACTERSET=<your_character_set> \
     -e ENABLE_ARCHIVELOG=true \
     -v [<host_mount_point>:]/opt/oracle/oradata \
    container-registry.oracle.com/database/enterprise:21.3.0.0
    
    Parameters:
    --name
    The name of the container (default: auto generated)
     
    -p
        The port mapping of the host port to the container port.
        Two ports are exposed: 1521 (Oracle Listener), 5500 (OEM Express)
     
    -e ORACLE_SID
        The Oracle Database SID that should be used (default:ORCLCDB)
     
    -e ORACLE_PDB
        The Oracle Database PDB name that should be used (default: ORCLPDB1)
     
    -e ORACLE_PWD
        The Oracle Database SYS, SYSTEM and 
        PDBADMIN password (default: auto generated)
     
    -e INIT_SGA_SIZE
        The total memory in MB that should be used for all 
        SGA components (optional)
     
    -e INIT_PGA_SIZE
        The target aggregate PGA memory in MB that should be used 
        for all server processes attached to the instance (optional)
    
    -e ORACLE_EDITION
        The Oracle Database Edition (enterprise/standard, default: enterprise)
    
    -e ORACLE_CHARACTERSET
        The character set to use when creating the database (default: AL32UTF8)
     
    -e ENABLE_ARCHIVELOG
        To enable archive log mode when creating the database (default: false). 
        Supported 19.3 onwards.
     
    -v /opt/oracle/oradata
    The data volume to use for the database. It has to be writable by the
    Unix "oracle" (uid: 54321) user inside the container. If omitted, the
    database will not be persisted over container recreation.
     
    -v /opt/oracle/scripts/startup | /docker-entrypoint-initdb.d/startup
        Optional: A volume with custom scripts to be run after database startup.
        For further details see the "Running scripts after setup and on
        startup" section below.
     
     -v /opt/oracle/scripts/setup | /docker-entrypoint-initdb.d/setup
        Optional: A volume with custom scripts to be run after database setup.
        For further details see the "Running scripts after setup and on startup" 
        section below
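For instance, a filled-in sketch of that command (the values here are illustrative assumptions; the container name and password match what I use later in this post):

podman run -d --name myoracledb \
 -p 1521:1521 \
 -e ORACLE_SID=ORCLCDB \
 -e ORACLE_PDB=ORCLPDB1 \
 -e ORACLE_PWD=password1234 \
container-registry.oracle.com/database/enterprise:21.3.0.0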

I believe the colons you see throughout the original code block are there for the definitions; you wouldn’t actually include them in your commands (the port and volume mappings are the exceptions; those colons are required). If you are coming from database development, I suspect some may think, “ahh, bind parameter.” I do not think that is the case here.

    Ports

You might be asking: in this code block, what the hell am I supposed to be looking at? Well, the container has a “listener”, listening on port 1521. So if I want to connect to the container, I’ll need to “map” to it. I’m not sure if that is what it is called exactly (not a networking guy, don’t claim to be). But the next question is: what is my <host_port> (as it is referred to in the code block above)?

    Everything matters, and nothing matters

Executing a ping command in my Terminal to see my computer’s address is great, but it tells me nothing about the port.

    Use ping localhost to see your IP address

So I took to the internet to try to figure out the appropriate port… Honestly, I’ve tried searching, but I can’t find anything definitive. In the Podman documentation, I see a lot of references to port 8080, as in localhost:8080.

    Care to review it? Here are some search results using “8080” as the search parameter.

Buried in the docs, there is a brief mention of the port; it can be found under the --publish (or -p) parameter. The way I understand ports on your local machine is that if you omit the host port, you shouldn’t have any problems; Podman will simply assign a random available one. So… it doesn’t matter, nothing matters. It’s all an illusion.
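In other words, either of these forms should work (a sketch, using the image and container name from later in this post):

podman run -d -p 1521:1521 --name myoracledb container-registry.oracle.com/database/enterprise:21.3.0.0
podman run -d -p :1521 --name myoracledb container-registry.oracle.com/database/enterprise:21.3.0.0

The first pins the host port to 1521; the second lets Podman pick a random available host port (which you can discover later with the podman port command).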

    I also reviewed the cURL documentation. I found something in the --connect-to option:

    Can you leave it empty?

    Aaaand, more port nonsense (if you are having trouble sleeping at night, try reading this):

    Starting the Podman Linux Virtual Machine

    Okay, with all this out of the way, I can finally start to make some progress (almost there, promise).

Remember, you have to start the Podman Linux virtual machine before you do anything (this is in the instruction steps, so review those first; there are specific steps for macOS). This is where the container “lives.”
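If this is your first time using Podman, you’ll need to initialize the machine before starting it (a sketch; both are standard Podman commands, and init is only needed once):

podman machine init
podman machine start

Once the virtual machine is up and running, you’ll see: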

    Podman virtual machine is ready

I then grabbed the Oracle container. But since I’m using Podman, I needed to modify the run command, like this:

    podman run -d -p :1521 --name myoracledb container-registry.oracle.com/database/enterprise:21.3.0.0
    REMINDER: Make sure you are logged into the Oracle Container Registry site before you attempt all this!

    Assuming you’re still following along, you’ll see something like this in your Terminal:

    New container ID

    I used the podman ps command to check the status of the container. You should see something like this:

    “Starting” and “Healthy” statuses

    For several minutes, you’ll continue to see the container status as “starting”. You can do like me and just periodically enter the podman ps command, or you can go do something meaningful with your time. Check back in 10 mins or so. The choice is yours. Either way, you’ll eventually see the status change from “starting” to “healthy”.

    The container is healthy and ready

    “Healthy” means I now have an Oracle Enterprise database (version 21.3.0.0) running in my Linux virtual machine as a container. I still need to log in with SQLcl, though.

    But wait…there’s more

Hold up, I can’t just log in with SQLcl yet. I still have some more setup to do. I need to reset the randomly generated password to one of my choosing. Our instructions tell you to first issue the docker logs [your database name] command to view your database logs; from there, you should be able to locate the temporary password. I couldn’t; maybe you can. Let me know if you were able to.

    Using the logs command

    Since I’m doing this in Podman, that command is slightly modified:

    podman logs myoracledb

    The printout of that command will appear like this (yours will be very similar). Although I wasn’t able to locate the password, there are still some important pieces of information that you’ll want to review and note.

    podman logs [your database name] printout

In this printout, you’ll see things like the host and port information, as well as the “Global Database Name” and “System Identifier (SID)”. You’ll see where the log files are located (your temporary password can be retrieved from there) and the database version you are running. Finally, you’ll see the message “DATABASE IS READY TO USE!”

    Use the included shell script for changing your password

    We are this close to logging onto the database. Even though I couldn’t find the temporary password, it doesn’t matter. You have to change your password anyways. If you refer back to the instructions on the Oracle Container Registry page, there is a section entitled “Changing the Default Password for SYS User” and it reads as such (emphasis added):

    On the first startup of the container, a random password will be generated for the database if not provided. The user [must] change the password after the database is created and the corresponding container is healthy.

    Using the docker exec command, change the password for those accounts by invoking the setPassword.sh script that is found in the container. Note that the container must be running. For example:

    $ docker exec <oracle-db> ./setPassword.sh <your_password>
    Oracle Docs

    Easy enough, and since my container is “healthy” at this point, I can execute this script. But since I’m using Podman, the command will look like this:

    podman container exec myoracledb ./setPassword.sh password1234

    And the output of that command will look like this:

    Automated password change with the provided Shell script

    I guess it worked. As you can see, my new password is password1234 (pleeeeease, do NOT share that with anybody). And at this point, I’m ridiculously close to logging onto this completely containerized Oracle enterprise database. All I need to do now is log on using the same steps as before (in my previous post).

    Connecting from outside the container

    Referring back to the Oracle Container Registry docs, I see the following:

    The different login options
    NOTE: Remember I'm logging into this container from the outside. 

The connect options are cut off in that image, so let me copy/paste them here. Also, assume that where it states “sqlplus” I’ll be connecting with SQLcl. The options are as follows:

    $ sqlplus sys/<your_password>@//localhost:<exposed_port>/<your_SID> as sysdba
    
    $ sqlplus system/<your_password>@//localhost:<exposed_port>/<your_SID>
    
    $ sqlplus pdbadmin/<your_password>@//localhost:<exposed_port>/<your_PDBname>

    At this point, I can proceed. I have all the ingredients for this “recipe.”

    Use the port command, duh

    Turns out you can just use the port command to discover the container’s port (I’m guessing this is the route the container uses to communicate with my MacBook – it’s all quite muddled at this point).

    Here is the command I executed:

    podman port myoracledb

    And here is what was returned:

    Exposing the ports for this network

    If you are starting your journey from the MacBook, its address would be 0.0.0.0 with a port of 43073. Data/info flows in and out of that port. And 1521 is a reference to the [bleep blurp 🤖] TCP port at which the Transparent Network Substrate (TNS) Listener is located.

    Actually, if you look at the previous output (from the podman logs myoracledb command) you’ll see how all the addresses and ports connect (including the TNS Listener).

    TNS Listener information
    It's in the logs, how could you not know this!?

    Honestly, this is all ludicrous. You shouldn’t know all this, nobody should! It’s too much to retain, but that’s okay, I’m glad you’re still here…toughing it out with me. Once you get past this first big hurdle, I imagine working with containers is very fun.


    Logging on with SQLcl, for real

    Here is where I actually logged on (or is it logged into?) to this database with SQLcl. Here’s the command I used:

    sql sys/password1234@//localhost:43073/ORCLCDB as sysdba

Which, if you recall, is modeled on the original command found in the Oracle Container Registry docs; it looks like this (it’s also a few paragraphs back):

    $ sqlplus sys/<your_password>@//localhost:<exposed_port>/<your_SID> as sysdba

    NOTE: Exposed port is that where the TNS Listener is located, and the SID is the “System Identifier” – I showed that in the database logs earlier.

    And again, I don’t think it matters if you include the localhost port. Here is what the output looked like in my Terminal:

    Connecting to the container with SQLcl

    Create and load a new table with SQLcl

Alright, so finally, I’m in! Next, I tested a SQLcl function to see if it worked as expected. I chose the LOAD function. And just as a refresher, I referred to the SQLcl help for in-context assistance on LOAD. That help looks like this:

    SQLcl Help and LOAD information

Specifically, I am going to test out the “CREATE TABLE” and “LOAD TABLE” functions. So I scrolled down to the examples for reference.

    Example showing how to create and load a table at the same time

    At this point, the commands are pretty straightforward. I decided to use the LOAD NEW command, as seen above.

    PRO TIP: You can simply drag the file and drop it into Terminal to get the complete file path.

DON’T forget to include the “NEW” at the end of the command. I forgot it the first time, and my load failed. It doesn’t break anything; it’s just a silly mistake.
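For reference, here is a hypothetical version of the command I ran (your table name and CSV file path will differ):

load t100movies /path/to/imdb_top_100.csv new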

I hit enter, and if you look at that image with the “Ready to execute” caption, everything worked as expected. Here it is zoomed in (please excuse the gray shading):

    Processed without errors 🫠

    SQL queries on the newly created table

    Alright, so I have a brand new table. And if you recall, this was a data set that included the IMDb top 100 highest-rated movies of all time.

    IMDb dataset by way of Kaggle

    Well in the next few images, I wanted to test some SQL on the new table. I first searched by genre, with the following SQL:

    SELECT DISTINCT genre FROM t100movies;
    Selecting by the movie genre

Which returns all the distinct matches. Easy enough, right? Then (because I like Adventure and Fantasy) I selected only those films that match those criteria, with this SQL statement:

    SELECT * FROM t100movies WHERE genre = 'Adventure, Fantasy';
    Single quotes and don’t forget your semi-colon 🫡

    And once I correctly entered the SQL query, I found a single movie in the top 100 that meets those criteria. I’m actually surprised this one made the top 100 list.

    Data persists in your container


    The files are in the [container]!

    Derek Z. and Hansel M.

    Okay, but there’s just one more thing. The data persists inside the container even after I’ve stopped it. This isn’t necessarily the focus of this article, but I just wanted to demonstrate that even after shutting everything down, the table still exists.

    This is true even after completely stopping my Podman container and shutting down the Podman Linux virtual machine.

The process was as follows (a sketch of the matching Podman commands appears after the list):

    • Exited out of the SQLcl application
    • Stopped the myoracledb container process
    • Checked to make sure the process was actually stopped
    • Stopped and then restarted the Podman Linux virtual machine
    • Restarted the myoracledb container
    • Executed the same SQL query as before
    • Exited from the SQLcl application a final time
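And here is that sketch (the container name comes from earlier; the SQLcl steps are omitted):

podman stop myoracledb
podman ps -a                  # confirm the container has actually stopped
podman machine stop
podman machine start
podman start myoracledb       # the original port mappings are preserved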

And if you take a look at all these images (they are numbered in order), you can see all the steps I took during this little test. Pretty cool, eh!?

    A couple of notes here:

    • When I restarted the container, it only took about a minute for it to move from a “starting” to a “healthy” status. So I think the first time you start this container it takes a while. Subsequent start-ups, a minute or less.
• When you start the container back up, you don’t have to map anything; I believe all those settings are still intact. Either that, or I just spent a whole bunch of time networking when I didn’t need to.

    The end

    And that does bring us to a close. If you’ve made it this far, you understand how you can at least get this container started and log on using SQLcl. Remember you can use Brew to install SQLcl and Podman. And of course, you’ll need to get the container I used (from our Container Registry; you can find it here in the Database category).

    Remember, I didn’t do anything with checkpoints (or checkpointing containers) or with volumes. At the time of this article, I wasn’t 100% confident in my approach, so I wanted to exclude it. However, as I understand it, volumes (and their use) are the preferred approach to persisting data for later use (in containers). Just keep that in mind.

    Finally, I’m not going to sugarcoat it. This was a grind – it was very tedious and frustrating, so hopefully, you can learn from my mistakes. I’m not claiming to be a developer or an expert in CI/CD. But I can totally see the appeal of having portable containers like this. The barrier to understanding stuff like this is incredibly high, so good luck to you. But hey, if you screw up, don’t worry about it. You can always uninstall and reinstall and try again.

    Be sure to leave a comment if you see something amiss or if you have a better approach to something I’ve shown here. And as always…

  • Where to learn SQL: a Slack message

    Where to learn SQL: a Slack message

What follows is a response I sent via Slack to one of our newest UX Designers. She comes to us by way of another sister business unit within Oracle. She was looking for some resources on where to learn/get better acquainted with SQL (which, for a UX and/or UI designer, I think is really impressive!).

    As I was putting this list together, the thought occurred to me, “hey, I think this stuff could be helpful to others outside of Oracle too!” So here we are. What follows (with minor edits) is a list of resources that I’ve found over the past year that have helped me to better understand SQL. Maybe you’ll discover another previously unknown resource.

    Resources

    In no particular order (seriously, these came to me at random):

    Update to the list. Thanks to Chris Saxon!

    Dev Gym

    Learn SQL with Dev Gym

    What is it? Verbatim, here is the “elevator pitch” from our site:

    Learn SQL in this FREE 12-part boot camp. It will help you get started with Oracle Database and SQL. The course is a series of videos to teach you database concepts, interactive SQL tutorials, and quizzes to reinforce the ideas. Complete the course to get your free certificate.

    Link to course

    Hey, pretty cool you end up with a free certificate too!

    NOTE: You'll need to create an Oracle account first. You can sign-up here. 

    O’Reilly

    O’Reilly Welcome page

Within Oracle, we have access to O’Reilly. You may, too; check internally. This is the second employer I’ve seen where this is available. It’s chock full of digital learning content – videos, tutorials, books, and guides. You can even create “Playlists” for similar topics; here are mine:

    My O’Reilly playlists

    Live SQL

    Oracle Live SQL

    Oracle has a browser-based app, Live SQL, where you can learn all sorts of SQL. I don’t learn like this, but others might (I need skill acquisition to be more practical). If you learn through rote, then this is the site for you!

    SQL Worksheet in the Database Actions Launchpad

    SQL Worksheet via Database Actions

    Sign up for one of our OCI Free Tier accounts and create an Autonomous Database (ADB). After that you can get a feel for how Database Actions (aka SQL Developer Web) works and how to interact with your database.

    From there, if you want to look at SQL specifically, I would focus on the SQL Worksheet. Once there, you can practice simple SQL queries.

    Reader: I don't know your level, so you may already be more familiar with this than me. But it's free, so why not?

    LiveLabs

    SQL learning in LiveLabs

    This is a straightforward and approachable entry point. Simply typing “sql” reveals tons of relevant workshops. LiveLabs home.

    Oracle SQL Language Guide

This is the official guide for the current Oracle version 21 database. It would be a good thing to bookmark. But there is so much stuff that you won’t want to read through it in one sitting.

    PL/SQL

    This is a PL/SQL language guide. I can only explain PL/SQL as “SQL Advanced.” It’s not better; it is a way to give you more control over when, how, and where to use SQL (my interpretation). Wikipedia does a better job of explaining. You won’t be using this initially. I’m just starting to get into it after a year. But the sooner you can use it, the better!
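If you’ve never seen it, here is a trivial anonymous PL/SQL block, just as an illustration (runnable in SQLcl or a SQL Worksheet; SET SERVEROUTPUT ON makes the printed line visible in SQLcl):

SET SERVEROUTPUT ON

BEGIN
    -- PL/SQL adds procedural constructs (variables, loops, conditionals) around SQL.
    DBMS_OUTPUT.PUT_LINE('Hello from PL/SQL');
END;
/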

    W3 Schools

    Great for many languages (as well as CSS/HTML). It is a memory HOG, though! I don’t know what is happening (probably the ads), but at least on Firefox, your computer’s fans will be working double-time. So get in, get out; otherwise, your computer will slow to a crawl. Link to SQL topic.

    Errors/troubleshooting

    StackOverflow

    Using the error code (or parts of it) as a keyword in StackOverflow works quite well. Like this:

    Using a random error code as an example

    You can even create Watched Tags to keep up on topics that you are most interested in.

    Watched Tags in StackOverflow

    Link to StackOverflow.

    YouTube

    YouTube works well too (duh). Particularly this user!

    Nothing fancy about these two suggestions...I almost always defer to practical videos for learning/diagnosing issues.

    The End

    And that’s all I can think of for right now. Those should set you on the right path. Did I miss anything?

    Leave a comment for your go-to SQL resources, I’ll add them to the list!

  • Using Python Pandas to turn ISO Country Codes into a string to use as values for a SQL Query

    Using Python Pandas to turn ISO Country Codes into a string to use as values for a SQL Query

    Summary, code, resources

    Problem

    While querying a table (based on this dataset) with SQL, you realize one of your columns uses 3-character ISO Country Codes. However, some of these 3-character codes aren’t countries at all; geographical regions and groups of countries are mixed in with the actual country codes. How can you filter out rows so you are left with only the countries?

    Answer

    Use the Python Pandas library to scrape ISO country codes and convert the values to one single string. Then use that string as values for a subsequent SQL query (possibly something like this):

    SELECT * FROM [your_table]
    WHERE country_code IN ([values from the generated list-as-string separated by commas and encased by single / double quotes]);

    Code

    # Libraries used in this code
    from bs4 import BeautifulSoup
    import requests
    import csv 
    import pandas as pd 
    
    # I found these ISO country codes on the below URL. Pandas makes it easy to read HTML and manipulate it. Very cool!
    iso_codes = pd.read_html("https://www.iban.com/country-codes")
    
    # I create a data frame, starting at an index of 0.
    df = iso_codes[0]
    
    # But really, all I care about is the 3-character country code. So I'll make that the df (dataframe) and strip out the index
    df = df['Alpha-3 code'].to_string(index=False)
    
    # From here, I'll save this little guy as a text file.
    with open("./countries.txt", "w") as f:
        f.write(df)
    
    # I'll set up a list. *** This was my approach, but if you find a better way, feel free to comment or adjust. ***
    my_list = []
    
    # Then I'll open that text file and read it in (the "with" block closes the file for me).
    with open("./countries.txt", "r") as file:
        countries = file.read()
    
    # I need to remove the "new line" identifiers, so I'm doing that here. 
    my_list = countries.split('\n')
    
    # Once I do that, I can create two new strings. I do this with f-Strings. Great article on using them here: https://realpython.com/python-f-strings/ 
    
    # I have two options here: one where the codes are contained by single quotes, the other with double quotes. Oracle Autonomous Database likes single quotes, but your DB may differ.
    
    countries_string_single_quotes = ','.join(f"'{x}'" for x in my_list)
    
    countries_string_double_quotes = ','.join(f'"{x}"' for x in my_list)
    
    # From here, I take those strings and save them in a text file. You don't have to do this; you can print and copy/paste the string. But this might be an excellent addition if you want to refer to these later without running all the code. 
    
    with open("./countries_as_list_single_quotes.txt", "a") as f:
        f.write(countries_string_single_quotes)
    
    with open("./countries_as_list_double_quotes.txt", "a") as f:
        f.write(countries_string_double_quotes)

    GitHub repo details

    You can find the code from this post in my GitHub repository. The repository consists of the following:

    • The Python code I created for solving this problem
    • A countries.txt file, which is produced midway through the code (temporary placeholder for later processing)
    • ‘Single quotes’ .txt file – the 3-character ISO Country Codes are formatted as a string. The values are enclosed by single quotes; commas throughout
    • “Double quotes” .txt file – the 3-character ISO Country Codes are formatted as a string. The values are enclosed by double quotes; commas throughout

    Resources


    The in-depth walkthrough

    Backstory

    A few days ago, I stumbled upon this question in the r/datasets subreddit:

    I spent most of the morning figuring out how I would go about this, and after some trial and error, I devised a plan. I decided to take the list of ISO Country Codes (which I found here) and use them as values for filtering in a SQL statement (later on in Oracle SQL Developer Web).

    After some research, I figured out the proper SQL syntax for a successful query.

    SELECT * FROM [your_table]
    WHERE country_code IN ([values from the generated list-as-string separated by commas and encased by single / double quotes]);

    From there, I knew I needed to work backward on those ISO Country Codes. Meaning I needed to take something that looked like this:

    And turn it into something more workable. It turns out that grabbing this was pretty straightforward. I’m using Pandas primarily for this exercise, but first, I need to import some libraries:

    # Libraries used in this code
    from bs4 import BeautifulSoup
    import requests
    import csv 
    import pandas as pd 

    Next, I’ll use Pandas’ read_html function (this feels like cheating, but it’s incredible) to read in the table.

    # I found these ISO country codes on the below URL. Pandas makes it easy to read HTML and manipulate it. Very cool!
    iso_codes = pd.read_html("https://www.iban.com/country-codes")
    
    # I create a data frame, starting at an index of 0.
    df = iso_codes[0]

    This is wild, but this is what the printout looks like:

    The Pandas read_html() function is powerful.

    If you squint, you can see an “Alpha-2 code” and an “Alpha-3 code” column in the image. From here, I need to isolate the Alpha-3 column. So I reshaped the data frame by making it a single column and dropping the index (dropping the index is optional; you could keep it if you needed it, perhaps to create a separate table in your database).

    # But really, all I care about is the 3-character country code. So I'll make that the df (dataframe) and strip out the index
    df = df['Alpha-3 code'].to_string(index=False)

    I’ll save this data frame as a .txt file.

    # From here, I'll save this little guy as a text file.
    with open("./countries.txt", "w") as f:
        f.write(df)

    This is only temporary (FYI: this is the only way I could figure out how to do this). It’ll look like this:

    The temporary .txt file of 3-character ISO Country Codes.

    Next, I take that temporary text file and read it in. I’m going to add it to a list, so I’ll first create the empty list (aptly named “my_list“). I also need to remove the newline characters; otherwise, when I create my string of values (that comes in the final step), the string will look like this:

    The “countries” string with “\n” characters.

    I remove the newline characters with this piece of code:

    # I need to remove the "new line" identifiers, so I'm doing that here. 
    my_list = countries.split('\n')

    The almost-finished list of values will look like this:

    New line characters have now been removed.

    I use F-Strings to create the two final strings: countries_string_single_quotes and countries_string_double_quotes, respectively. Need to learn about F-Strings (or, more formally, Literal String Interpolation)? No problemo! Check out these three resources:

    The code for the F-Strings is below. I loop through my_list and separate the x (the things I’m iterating over) with commas (that’s the join).

    # Once I do that, I can create two new strings. I do this with f-Strings. Great article on using them here: https://realpython.com/python-f-strings/ 
    
    # I have two options here: one where the codes are contained by single quotes, the other with double
    # quotes. Oracle Autonomous Database likes single quotes, but your DB may differ.
    
    countries_string_single_quotes = ','.join(f"'{x}'" for x in my_list)
    
    countries_string_double_quotes = ','.join(f'"{x}"' for x in my_list)

    And now that I have these two objects (are they called objects??), I’ll save them each as a text file. One file has the 3-character codes surrounded by single quotes, the other with double quotes. The code:

    # From here, I take those strings and save them in a text file. You don't have to do this; you can print
    # and copy/paste the string. But this might be a nice addition if you want to refer to these later 
    # without running all the code. 
    
    with open("./countries_as_list_single_quotes.txt", "a") as f:
        f.write(countries_string_single_quotes)
    
    with open("./countries_as_list_double_quotes.txt", "a") as f:
        f.write(countries_string_double_quotes)

    The text files look like this now:

    The country codes are now presented in one long string. Pretty cool, eh?

    SQL time

    We have arrived! Let me show you what I can do now!

    I took the CSV data from the World Bank and loaded it into my Autonomous Database. Our returning intern Layla put together a video of how to do this; you can check it out here:

    Once my table was created, I did a SELECT [columns] FROM query. Here you can see my “beginning state”.

    There are 266 entries; some are countries, and others are not. And if you recall, the original question asked how somebody could filter out the non-countries. Onto that next!

    This is the best part. I can take the string I made and use that in a SQL query such as this:

    SELECT * from ADMIN.REDDIT_TABLE
    WHERE COUNTRY_CODE IN('AFG','ALA','ALB','DZA','ASM','AND','AGO','AIA','ATA',
    'ATG','ARG','ARM','ABW','AUS','AUT','AZE','BHS','BHR','BGD','BRB','BLR','BEL',
    'BLZ','BEN','BMU','BTN','BOL','BES','BIH','BWA','BVT','BRA','IOT','BRN','BGR',
    'BFA','BDI','CPV','KHM','CMR','CAN','CYM','CAF','TCD','CHL','CHN','CXR','CCK',
    'COL','COM','COD','COG','COK','CRI','CIV','HRV','CUB','CUW','CYP','CZE','DNK',
    'DJI','DMA','DOM','ECU','EGY','SLV','GNQ','ERI','EST','SWZ','ETH','FLK','FRO',
    'FJI','FIN','FRA','GUF','PYF','ATF','GAB','GMB','GEO','DEU','GHA','GIB','GRC',
    'GRL','GRD','GLP','GUM','GTM','GGY','GIN','GNB','GUY','HTI','HMD','VAT','HND',
    'HKG','HUN','ISL','IND','IDN','IRN','IRQ','IRL','IMN','ISR','ITA','JAM','JPN',
    'JEY','JOR','KAZ','KEN','KIR','PRK','KOR','KWT','KGZ','LAO','LVA','LBN','LSO',
    'LBR','LBY','LIE','LTU','LUX','MAC','MKD','MDG','MWI','MYS','MDV','MLI','MLT',
    'MHL','MTQ','MRT','MUS','MYT','MEX','FSM','MDA','MCO','MNG','MNE','MSR','MAR',
    'MOZ','MMR','NAM','NRU','NPL','NLD','NCL','NZL','NIC','NER','NGA','NIU','NFK',
    'MNP','NOR','OMN','PAK','PLW','PSE','PAN','PNG','PRY','PER','PHL','PCN','POL',
    'PRT','PRI','QAT','REU','ROU','RUS','RWA','BLM','SHN','KNA','LCA','MAF','SPM',
    'VCT','WSM','SMR','STP','SAU','SEN','SRB','SYC','SLE','SGP','SXM','SVK','SVN',
    'SLB','SOM','ZAF','SGS','SSD','ESP','LKA','SDN','SUR','SJM','SWE','CHE','SYR',
    'TWN','TJK','TZA','THA','TLS','TGO','TKL','TON','TTO','TUN','TUR','TKM','TCA',
    'TUV','UGA','UKR','ARE','GBR','UMI','USA','URY','UZB','VUT','VEN','VNM','VGB',
    'VIR','WLF','ESH','YEM','ZMB','ZWE')
    ORDER BY COUNTRY_CODE ASC;

    Once I execute that SQL statement, I’m left with the countries from that list. I opened up the results in another window so you can see a sample.

    The end

    So yeah, that’s it! I don’t know if this was the best way to go about this, but it was fun. I’m curious (if you’ve made it this far), what do you think? How would you go about it? Let me know.

    And two more things: remember to share this and…

  • More fun with Medium story stats, JSON, Python, Pandas, and Oracle SQL Developer Web

    More fun with Medium story stats, JSON, Python, Pandas, and Oracle SQL Developer Web

    That’s right; I’m back again for yet another installment of this ongoing series dedicated to working with Medium.com story stats. I first introduced this topic in a previous post. Maybe you saw it. If not, you can find it here.

    Recap

    My end goal was to gather all story stats from my Medium account and place them into my Autonomous Database. I wanted to practice my SQL and see if I could derive insights from the data. Unfortunately, gathering said data is complicated.

    Pulling the data down was a breeze once I figured out where to look for these story statistics. I had to decipher what I was looking at in the Medium REST API (I suppose that was somewhat tricky). My search was mostly an exercise in patience (there was a lot of trial and error).

    I uploaded a quick video in the previous post. But I’ll embed it here so you can see the process for how I found the specific JSON payload.

    Obtaining the raw JSON

    Once I found that URL, I saved this JSON as a .json file. The images below show remnants of a JavaScript function captured with the rest of the JSON. I’m no JavaScript expert, so I can’t tell what this function does. But before I load this into my Autonomous Database (I’m using an OCI Free Tier account, you can check it out here if you are curious), it needs to go.


    README

    I am pointing out a few things that may seem convoluted and unnecessary here. Please take the time to read this section so you can better understand my madness. 
    
    FIRST: Yes, you can manually remove the [presumably] JavaScript saved along with the primary JSON payload (see above paragraphs). I'm showing how to do this in Python as a practical exercise. But I'm also leaving open the opportunity for future automation (as it pertains to cleaning data). 
    
    SECOND: When it comes to the Pandas data frame steps, of course, you could do all this in Excel, Numbers, or Sheets! Again, the idea here is to show you how I can clean and process this in Python. Sometimes doing things like this in Excel, Numbers, and Sheets is impossible (thinking about enterprise security here). 
    
    THIRD: Admittedly, the date-time conversion is hilarious and convoluted. Of course, I could do this in a spreadsheet application. That's not the point. I was showing the function practically and setting myself up for potential future automation. 
    
    FOURTH: I'll be the first to admit that the JSON > TXT > JSON > CSV file conversion is comical. So if you have any suggestions, leave a comment here or on my GitHub repository (I'll link below), and I'll attribute you!

    The code

    Explaining the code in context, with embedded comments, will be most illuminating.

    I’ve named everything in the code as literally as possible. In production, this feels like it might be impractical; however, there is no question about what the hell the code is doing! Being more literal is ideal for debugging and code maintenance.

    Here is the entire code block (so CTRL+C/CTRL+V to your heart’s content 😘). I’ll still break this down into discrete sections and review them.

    import csv
    import json
    import pandas as pd 
    import datetime
    from pathlib import Path
    
    # You'll first need to sign in to your account, then you can access this URL without issues: 
    # https://medium.com/@chrishoina/stats/total/1548525600000/1668776608433
    # NOTES:
    # Replace the "@chrishoina" with your username.
    # The two numbers you see are Unix Epochs; you can modify those as needed. In my case, I
    # wanted to see the following:
    # * 1548525600000 - At the time of this post, this seems to be whenever your first post
    #   was published or when you first created a Medium account. In this case, for me, this
    #   was Sat, Jan/26/2019, 6:00:00PM - GMT
    # * 1665670606216 - You shouldn't need to change this since it will just default to the
    #   current date.
    
    # For the conversion, I used an Epoch Converter tool I found online: https://www.epochconverter.com/
    
    # Step 1 - Convert this to a (.txt) file
    p = Path("/Users/choina/Documents/socialstats/1668776608433.json")
    p.rename(p.with_suffix('.txt'))
    
    # Step 2 - "read" in that text file, and remove those pesky
    # characters/artifacts from position 0 through position 15. 
    # I'm only retaining the JSON payload from position 16 onward.
    
    with open("/Users/choina/Documents/socialstats/1668776608433.txt", "r") as f:
        stats_in_text_file_format = f.read()
    
    # This [16:] essentially means grabbing everything in this range. Since
    # there is nothing after the colon, it will just default to the end (which is
    # what I want in this case).
    
    cleansed_stats_from_txt_file = stats_in_text_file_format[16:]
    
    print(cleansed_stats_from_txt_file)
    
    # This took me a day to figure out, but this text file needs to be encoded
    # properly, so I can save it as a JSON file (which is about to happen). I
    # always need to remember this, but I know that json.dumps = dump
    # string, while json.dump = dump object. There is a difference; I'm not
    # the expert, but the docs were helpful. 
    # (Note: json.dumps returns a new string; its return value isn't captured here.)
    
    json.dumps(cleansed_stats_from_txt_file)
    
    # Step 3 - Here, I create a new file, then indicate we will "w"rite to it. I take the 
    # progress from Step 2 and apply it here. 
    
    with open('medium_stats_ready_for_pandas.json', 'w') as f:
        f.write(cleansed_stats_from_txt_file)
    
    # Step 4 - Onto Pandas! We've already imported the pandas library as "pd."
    # We first create a data frame and name the columns. I kept the names
    # very similar to avoid confusion. I feared that timestampMs might be a
    # reserved word in Oracle DB or too close, so I renamed it. 
    
    df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS']) 
    
    with open("/Users/choina/Documents/socialstats/medium_stats_ready_for_pandas.json", "r") as f: 
        data = json.load(f)
        data = data['payload']['value']
    
    print(data)
    
    for i in range(0, len(data)):
        df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'], data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]
    
    df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
    
    print(df.columns)
    
    # Step 5 - use Pandas' df.to_csv function and save the data frame as
    # a CSV file
    
    with open("medium_stats_ready_for_database_update.csv", "w") as f:
        df.to_csv(f, index=False, header=True)

    I used several Python libraries for this script:

    Step 1

    p = Path("/Users/choina/Documents/socialstats/1668776608433.json")
    p.rename(p.with_suffix('.txt'))

    Pathlib allows you to assign the file’s path to “p”. From there, I changed the .json file extension to a .txt extension.

    Note: Again, I'm sure there is a better way to do this, so if you're reading, leave a comment here or on my GitHub repository so I can attribute it to you 🙃. 

    The before and after of this step looks like this:

    With that out of the way, I needed to remove that JavaScript “prefix” in the file. I do this in Step 2 (I got so fancy that I probably reached diminishing returns). My approach works, and I can repurpose this for other applications too!

    Step 2:

    # Step 2 - "read" in that text file, and remove those pesky
    # characters/artifacts from position 0 through position 15. Or in other
    # words, you'll retain everything from position 16 onward because that's
    # where the actual JSON payload is.
    
    with open("/Users/choina/Documents/socialstats/1668776608433.txt", "r") as f:
    stats_in_text_file_format = f.read()
    
    # This [16:] essentially means grabbing everything in this range. Since
    # there is nothing after the colon; it will just default to the end (which is
    # what I want in this case).
    cleansed_stats_from_txt_file = stats_in_text_file_format[16:]
    
    print(cleansed_stats_from_txt_file)
    
    # This took me a day to figure out, but this text file needs to be
    # appropriately encoded to save as a JSON file (which is about to
    # happen). I always forget the difference between "dump" and "dumps";
    # json.dumps = dump string, whereas json.dump = dump object. There is
    # a difference, I'm not the expert, but the docs were helpful (you should
    # read them). 
    
    json.dumps(cleansed_stats_from_txt_file)
    
    I needed to remove these remnants from the Medium JSON response

    While this initially came through as a JSON payload, those first 0-15 characters had to go.

    FULL DISCLAIMER: I couldn't figure out how to get rid of this while it was still a JSON file hence why I converted this to a text file (this was the only way I could figure it out).

    I captured position 16 to infinity (or the end of the file, whichever occurs first), then I re-encoded the file as JSON (I interpreted this as “something the target machine can read and understand as JSON“).

    OPEN SEASON: CompSci folks, please roast me in the comments if I'm wrong. 
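
    For anyone else who mixes those two up as often as I do, here’s the distinction in a tiny, self-contained example:

    import json
    
    payload = {"success": True}
    
    s = json.dumps(payload)       # dumps returns the JSON as a *string*
    print(s)                      # {"success": true}
    
    with open("example.json", "w") as f:
        json.dump(payload, f)     # dump writes the JSON to a file *object*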

    Step 3

    # Step 3 - I create a new file, then I'll "w"rite to it. I took the result from Step 2 and applied it here. 
    
    with open('medium_stats_ready_for_pandas.json', 'w') as f: 
        f.write(cleansed_stats_from_txt_file)

    I’m still at the data-wrangling portion of this journey, but I’m getting close to the end. I’ll create a new JSON file, take the parts of the (freshly encoded) text file I need, and then save them as that new JSON file.

    Step 4

    # Step 4 - Onto Pandas! We've already imported the pandas library as "pd"
    
    # I first create a data frame and name the columns. I kept the names 
    # similar to avoid confusion. I feared that timestampMs might be a
    # reserved word in Oracle DB or too close, so I renamed it. 
    
    df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS']) 
    
    with open("/Users/choina/Documents/socialstats/medium_stats_ready_for_pandas.json", "r") as f: 
    data = json.load(f)
    data = data['payload']['value']
    
    print(data)
    
    for i in range(0, len(data)):
        df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'],
                     data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]
    
    df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
    
    print(df.columns)

    I won’t teach Pandas (and honestly, you do NOT want me to be the one to teach you Pandas), but I’ll do my best to explain my process. I first created the structure of my data frame (“df” in this case). And then, I named all the column headers (these can be anything, but I kept them very close to the ones found in the original JSON payload).

    I then opened the newly-saved JSON file and extracted what I needed.

    NOTE: I got stuck here for about a day and a half, so let me explain this part. 

    The data['payload']['value'] refers to the key and value in this particular {key: value} pair. This approach allowed me to grab all the values of “value“. This image explains what I started with (on the left) and what I ended up with (on the right).

    The before and after JSON payload

    You’ll notice a {"success": true} key: value pair. With this method, I removed that pair and shed others at the end of the JSON payload.

    Removing a great deal of trash
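
    If it helps, here’s a trimmed sketch of the payload’s shape; the key names come from the real response, but the values here are invented:

    # A hypothetical, trimmed version of the Medium stats response (values invented).
    raw = {
        "success": True,
        "payload": {
            "value": [
                {"userId": "abc123", "flaggedSpam": False,
                 "timestampMs": 1548525600000, "upvotes": 1, "reads": 10,
                 "views": 25, "claps": 2, "updateNotificationSubscribers": 0},
                # ...one object per day...
            ],
        },
        # ...the trailing pairs that also get shed...
    }
    
    data = raw["payload"]["value"]   # keeps only the list of daily stat objects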

    I can’t take credit for organically coming up with this next part; Kidson on YouTube is my savior. I’d watch this video to understand entirely what is happening in this piece of code:

    for i in range(0, len(data)):
        df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'],                                   
        data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]

    In short, you take the values from the columns in the JSON file (above) and then put them into the column locations named in this piece of code:

    df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS'])  

    For instance, the "userId" values in the JSON file will all go into the 'USERID' column in the Pandas data frame. And the same thing will happen for the other values and associated (Pandas data frame) columns.

    Finally, I changed the date (which, if you recall, is still in that Epoch format) to a friendlier, readable date using Pandas’ to_datetime function. Using this code:

    df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
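
    As a quick sanity check, here’s that conversion in isolation, using the two Epochs from the code comments earlier:

    import pandas as pd
    
    # Unix epoch milliseconds in, readable timestamps out.
    s = pd.Series([1548525600000, 1668776608433], name='STATSDATE')
    print(pd.to_datetime(s, unit='ms'))
    # 0   2019-01-26 18:00:00.000
    # 1   2022-11-18 13:03:28.433
    # Name: STATSDATE, dtype: datetime64[ns]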

    Step 5

    with open("medium_stats_ready_for_database_update.csv", "w") as f:
        df.to_csv(f, index=False, header=True)

    I’m at the home stretch now. I take everything I’ve done in Pandas and save it as a CSV file. I wanted to keep the headers but ditch any indexing. The clean CSV file will look like this:

    Cleaned, tidy CSV ready for Data Load via SQL Developer Web

    Step 6

    Lastly, I logged into SQL Developer Web and clicked the new Data Load button (introduced in Oracle REST Data Services version 22.3) to upload the CSV file into a new table. The Autonomous Database automatically infers column names and data types. I slightly modified the "statsdate" column (honestly, I could have left it alone, but it was easy enough to change).

    Before and After

    And that’s it! Once uploaded, I can compare what I did previously to what I have achieved most recently. And both ways are correct. For instance, depending on your requirements, you can retain the JSON payload as a CLOB (as seen in the first image) or a more traditional table format (as seen in the second image).

    Wrap up

    If you’ve made it this far, congrats! You should now have two ways to store Medium stats data in a table (that lives in the Oracle Autonomous Database) either as:

    • a CLOB
    • an OG table

    And if you’d like to review the code, you can find it here.

    And until next time 🫠, happy querying.

  • Flask, Python, ORDS, and Oracle CloudWorld 2022

    Flask, Python, ORDS, and Oracle CloudWorld 2022

    Python, Flask, and ORDS

    Restaurants Web Application update

    If you’ve been following along, then you are probably aware of the python/flask/ORDS journey that I’ve embarked on. If not, you can read up on the overview here. The idea was to take local restaurant health inspection scores (aka Sanitary grades) and present the data to a user in a visualization. The web application is built with Flask, a popular Python web framework. And we’ll interact with our Oracle Autonomous Database using our Oracle REST Data Services (ORDS) APIs (for both GET and POST requests).

    That progress is moving along, and I’ve been toying with the Folium library for map visualizations.

    If you are familiar with Leaflet.js then you basically know Folium – a python wrapper used for displaying map renderings from Leaflet. You can get a taste of what I’m working towards in this latest post.
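
    For a sense of why Folium is appealing, here’s roughly the minimum it needs to render a map. A sketch; the coordinates are just an example (roughly downtown Raleigh):

    import folium
    
    # Create a Leaflet map centered on a point, drop a marker, and write it out
    # as a self-contained HTML page you can open in any browser.
    m = folium.Map(location=[35.78, -78.64], zoom_start=13)
    folium.Marker([35.78, -78.64], popup="A restaurant").add_to(m)
    m.save("map.html")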

    However, this project is on hold (very, very) temporarily while I become familiar with another Python / Flask / ORDS web application.

    eCommerce/Retail Web Application [NEW]

    Currently titled “Frank’s Franks,” this application is what I’m calling an eCommerce-type application: you are presented with product information, you make a product selection, and you create an order. There is no credit card gateway, but the application does actually POST order information to a live Autonomous Database.

    This application is the brainchild of our very own Brian Spendolini. But I’m taking a crack at making some additions to it with what I’ve learned from Folium and Bootstrap (HTML and CSS). I’m not sure of the exact final state, but you can follow along as we continue iterating on it (more on this in the GitHub section).

    And here is a very crude example of the index (aka “home”) page:

    And here is a nice screenshot of the python code along with some in-progress HTML:

    Our Flask application is on the left, with the two additional right panes showing the base.html and index.html pages.

    I’m showing this because this will be one of the practical labs that our team will be facilitating in October. Why October you ask? Great question…

    Oracle CloudWorld

    Because that is when Oracle CloudWorld is happening. Cue gratuitous copy/paste in 3…2…1…

    Join us at CloudWorld, the new global conference where our customers and partners can share ideas, develop in-demand skills, and learn about cloud infrastructure and applications solutions that serve their unique roles and business needs. At CloudWorld, you’ll experience first-look product demos by Oracle experts, access hands-on training and certification opportunities, and shape the cloud innovation conversation alongside your peers and industry leaders.

    Now I’m not sure who on the team will be attending, but rest assured, one of us will be there.

    GitHub

    Finally, GitHub. Why am I including an ominous “GitHub” section? For two reasons. First, if you want to follow along with the progress of this CloudWorld lab, I encourage you to peruse my repo, which can be found here.

    Second, I’ve started to build out a python-themed library that will document code snippets and frequent code patterns when working with popular libraries + ORDS. The obvious ones are things like “requests” and “pandas”. And a lot of it is things that I have encountered personally. But I’m discovering there is so much more out there for python folks, so hopefully, this will continue to expand.

    If you do visit the “library” you’ll see that I have created some ORDS endpoints based on historical U.S. home sale data. You can check it out here.

    Current state:

    The current state of the python + ORDS library repo on GitHub
    Those endpoints are actually enabled, so you can work with them without having to authenticate. Now, of course, you could totally mess things up and POST or DELETE whatever...but let's try to avoid that. 

    But wait, there’s more

    You have both an invitation and a request.

    An invitation to follow along with both the lab progress and the python library project.

    And a request to

    • submit ideas and possible areas of improvement (or cool code patterns) for this CloudWorld application,
    • and submit any requests you might have for the python library that I’ve started to build out

    This is all voluntary of course, but if you think any of this might benefit others, please do engage and share!

    Finally, I’ll be back to regularly-scheduled technical posts soon. In the meantime…

    Keep following

  • Python or Filtering in Queries: sorting JSON responses from an Oracle Database

    Python or Filtering in Queries: sorting JSON responses from an Oracle Database

    In a recent post, I discussed taking the JSON response from my REST-enabled database table (which looked like this):

    The ORDS-provided URI for my REST-enabled table

    And sorting those cities after I performed a GET request (using some python code). However, after a recent discussion and a suggestion from thatjeffsmith, I think I should share another way you can sort this data from the source.

    Background

    If you recall, the portion of my python code that:

    1. Performs a GET request
    2. Extracts the cities to a list, and
    3. Sorts them in ascending order,

    …looks like this:

    def city_choices_python(): 
        import requests
        import json 
        city_list = []
    
        URL = 'https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/pythondev/cities/'
       
        response = requests.get(URL)
        r = json.loads(response.text)
        for nestedArr in r['items']:
            city_list.append(nestedArr['city'])
        return sorted(city_list)

    As you can see, in this version the URI provided by ORDS remains untouched. I’m just using that simple “sorted(city_list)” function to return a list of sorted cities in ascending order (FYI: it’s the last line in this code block).

    Filtering in queries

    This approach worked fine in my case. However, I am now…enlightened…

    Meaning, I am now aware that I have other options. It hadn’t occurred to me that I could filter in queries at the time of the GET request.

    Filtering is the process of limiting a collection resource by using a per-request dynamic filter definition across multiple page resources, where each page contains a subset of items found in the complete collection. Filtering enables efficient traversal of large collections.

    ORDS Docs

    What I like about this is that filtering in queries doesn’t require any changes to the original URI. Meaning, you wouldn’t need to go back into (in my case I’m using…) Database Actions to modify anything. You simply include filters along with the URI.

    In my case it would look something like this:

    https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/pythondev/cities/?q={"$orderby":{"city":"asc"}}

    Time-out!

    This will screw with your head, so listen up. Quotation marks (” “) aren’t valid characters in a URL, so they have to be percent-encoded before they can be transmitted over the internet.

    A quotation mark becomes “%22” in a URL. So if you take the above URL, copy it, and then paste it again, it may end up looking like this in your browser’s address bar:

    https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/pythondev/cities/?q={%22$orderby%22:{%22city%22:%22asc%22}}

    You start out with literal quotation marks, but at some point they are percent-encoded: a “%” followed by the character’s hexadecimal value.

    I only mention this because I noticed when I took the updated URL and copied it into my VS Code editor it ended up looking like this:

    def city_choices_ords(): 
        import requests
        import json 
        city_list = []
    
        URL = 'https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/pythondev/cities/?q={%22$orderby%22:{%22city%22:%22asc%22}}'
    
        response = requests.get(URL)
        r = json.loads(response.text)
        for nestedArr in r['items']:
            city_list.append(nestedArr['city'])
    return city_list 

    Don’t be alarmed, everything still works. But the query will look like this:

    ?q={%22$orderby%22:{%22city%22:%22asc%22}}
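
    Incidentally, if you’d rather not hand-encode (or copy/paste) the query string at all, the Requests library will do the percent-encoding for you. A minimal sketch:

    import json
    import requests
    
    URL = 'https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/pythondev/cities/'
    
    # Build the filter as a plain dict, serialize it to JSON, and let requests
    # percent-encode the quotation marks (and everything else) in the final URL.
    q = {"$orderby": {"city": "asc"}}
    response = requests.get(URL, params={"q": json.dumps(q)})
    print(response.url)   # shows the %22-encoded query string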

    What is this saying?

    The query above requests that the cities table be ordered (“orderby”) in ascending (“asc”) order according to the “city” column. You can see this in action if you point your browser to it. It will display like this:

    The JSON response, now sorted by city, from the REST-enabled table

    Ordering isn’t the only type of filtering you can do; the combinations are pretty much infinite…

    LeVar Burton in no way endorses ORDS, but if he did anything database- or API-related, I like to think that he would.

    Take a look.

    Comparing performance

    One last thing…I conducted a very crude A/B performance test, comparing the two code versions.

    In short, there was no appreciable difference. I should remind you that I’m working with a very small database object (a table with 2 columns and 18 cities). So if there is a difference one way or another, it is negligible.

    I still recorded the test for fun. And you’ll see across the ten requests (five for each version), the elapsed times were essentially the same. I know this isn’t statistically sound, but it satisfied my curiosity. Our documentation states that filtering is advantageous for traversing large collections, so my tiny table is probably a terrible example. And as requests scale, I have to imagine filtering in queries would begin to pull ahead.

    Nonetheless, see the results for yourself:

    The End

    Now you’ve got a couple options for sorting when working with a REST-enabled table (or other object). The nice thing is that setting up these endpoints with ORDS can still be accomplished with simple mouse clicks.

    Also, remember, sorting isn’t the only thing that can be accomplished when filtering in queries. There are TONs of other capabilities, too many to list here. But I would bookmark this page so you can reference later on.
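
    To give you a taste, here are a few more clauses from the same filter dialect (the city values here are just examples, and POPULATION is a hypothetical numeric column):

    ?q={"city":{"$eq":"Raleigh"}}          <- equality
    ?q={"city":{"$instr":"Ra"}}            <- substring match
    ?q={"population":{"$gt":100000}}       <- numeric comparison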

    I don’t want to say you can filter everything, but it sure looks like it. And for a python newbie like myself, this might be a better approach moving forward.

    Funny story, the day I added that “sort()” function in my code was literally the same day (earlier that morning in fact) that I watched a “python tips” video, where sorting was mentioned. It was pure luck.

    Had it not been for that video, I would have been stuck for at least another day trying to figure out how to sort my list.

    Hey, at least now I know I won’t be restricted to what I can filter based on the python functions I do or don’t know. I can’t be the only one….right?

    Follow

    Be sure to follow along for more fun:

  • Working with WTForms and Oracle REST Database Services (ORDS) APIs

    Working with WTForms and Oracle REST Database Services (ORDS) APIs

    Welcome Back

    The title pretty much speaks for itself, and if you’ve been following along, you’d know that I’m working on a demo application that uses Flask (a web application microframework) connected to my Oracle Autonomous Database (ADB) via Oracle REST Data Services (ORDS) APIs.

    One of Flask’s strengths is that it lets you extend functionality with a good bit of customization. But…

    By default, Flask does not include a database abstraction layer, form validation or anything else where different libraries already exist that can handle that. Instead, Flask supports extensions to add such functionality to your application as if it was implemented in Flask itself.

    Flask Docs

    Why ADB and ORDS?

    I’m primarily using this specific configuration (with ORDS and ADB) for obvious reasons. But secondarily, I haven’t come across anything that actually references the Oracle ADB or ORDS (in the context of Flask, WTForms, and python of course).

    Most of the online tutorials I’ve seen and books I’ve read reference SQLite or (in one case) MongoDB for the database. SQLite does seem like a good entry point, especially if you are putting together a proof of concept. But as things scale, I suspect you’ll reach diminishing performance returns with SQLite. In fact, the Flask documentation mentions exactly that:

    Python comes with built-in support for SQLite in the sqlite3 module. SQLite is convenient because it doesn’t require setting up a separate database server and is built-in to Python. However, if concurrent requests try to write to the database at the same time, they will slow down as each write happens sequentially. Small applications won’t notice this. Once you become big, you may want to switch to a different database.

    Flask Docs

    That last sentence is significant. I know ORDS can be set up with High Availability in mind, so why not take advantage of that from the beginning? Also, with my ADB, the effort to create a database schema and tables is very minimal (fun even).

    Regarding MongoDB: obviously I’m biased, but I’m also disorganized. And I’m still a “technical toddler.” So having to download, install, and configure MongoDB Community Edition plus the MongoDB Compass GUI would end up being an unnecessary step for me (I actually ended up doing this for a previous tutorial).

    Regardless, I already have my OCI tenancy and my Autonomous Database, and I continue to get more and more comfortable with our ecosystem.

    Also: You should sign up for an Always Free OCI account.

    Moving on…I also want to use Database Actions, and the REST API features provided by ORDS. Not to mention, from everything I’ve seen in the Flask tutorials you have to first create database models before you can do anything really meaningful. I’m not entirely sure how to go about that (please check back in 2023).

    ORDS is straightforward – just give me the URI for the API endpoint and I’ll take it from there. If I need something additional we can modify the API accordingly.

    API GET Request response (via ORDS)

    Speaking of which, here is what the JSON looks like from my ORDS-powered API GET request:

    The JSON response from my ORDS GET request

    I won’t spend too much time explaining this image (you can read about it in detail, here), but I’m most interested in the city: (found under items). There are 18 cities, and I’ll need all of them for Flask. More specifically, I need them for WTForms.

    About WTForms

    I won’t give a crash course on WTForms, but bottom line it’s another way to extend the capabilities of Flask.

    WTForms is a flexible forms validation and rendering library for Python web development. It can work with whatever web framework and template engine you choose. It supports data validation, CSRF protection, internationalization (I18N), and more.

    WTForms Docs

    In my scenario, WTForms allows me to easily make and handle (as in routing) forms in the Flask application. More specifically, there is a SelectField() field that I need to use (this is just how I initially designed the app; you can review it here).

    Choices

    The forms have fields. Here are details on the form I’m most interested in:

    Details of the WTForms SelectField

    You’ll notice there are various parameters I can include, but all I care about are the choices[]. You can either hard-code (static) a list of choices or (as I just successfully did as of this week) pass a list from an API GET request to the choices[] parameter.

    If you’ve looked at the documentation, then you’ve seen that choices[] can be “passed” in one of three ways:

    1. a list of (value, label) pairs (it can also be a list of only values, in which case the value is used as the label)
    2. a dictionary of {label: list} pairs defining groupings of options, or
    3. a function taking no argument, and returning either a list or a dictionary

    My approach used the first option. Actually, the values-only approach, in my case.
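
    To make that concrete, here’s a minimal, standalone sketch of both flavors of the first option (the city values are hypothetical, and I’m assuming a recent WTForms version):

    from wtforms import Form, SelectField
    
    class CitiesForm(Form):
        # Values-only: each value doubles as its label (my approach).
        city = SelectField("Let's begin by choosing a city",
                           choices=['Apex', 'Cary', 'Raleigh'])
    
        # The same list as explicit (value, label) pairs would look like:
        # choices=[('apex', 'Apex'), ('cary', 'Cary'), ('raleigh', 'Raleigh')]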

    The breakdown

    First, I’ll show the dropdown list; it will look like this in my application (FYI this actually works):

    Flask application in action

    I know that video is fast, but you should be able to see the dropdown list (at around the :15-:17 mark) and how it renders.

    This dropdown list is made possible from an initial GET request to my ORDS-enabled “Cities” table. And the code in its entirety (including the Flask bits), looks like this:

    The complete code: Flask, WTForms, and the ORDS GET request
    NOTE: I'm working off the assumption that you may have some familiarity with flask and WTForms. And also that your focus is the REST API - to - choices part of this problem. 

    By the numbers

    Okay, a quick overview:

    • Lines 27-29: This is pretty standard for Flask, no explanation is needed
    • Lines 31-33: Are where I build out the form:
      • You’ll see that city is the field (to select from)
      • The dropdown will have a heading that says, “Let’s begin by choosing a city” (you can see this in the video)
      • And the choices will be a list []
      • Submit on line 33 is just how you’d create the submit button
    • Lines 37-39: Are also pretty standard for any GET request using the Requests library in python
    NOTE: But here, you'll see that my URL is actually the URI provided by ORDS (remember, I REST-enabled my cities table - that is the "endpoint" we are seeing) 
    • Line 40: I make a list (it’s empty at this step, so sad)
    • Lines 41-43: This is the pièce de résistance.
      • Here we perform a loop through the items: in the JSON response, and since this is a nested array, you have to specify what array you want and what exactly in that array you want to pull out (in this case the “city”)
      • Line 43 was a suggestion by thatjeffsmith, to display the list in alphabetical order (I must confess, it does look better)
    • Lines 45-48: These are pretty standard for Flask as well. You need to set city (in line 32) equal to the city_list so WTForms knows where the SelectField(choices=[]) comes from

    The Code

    With the brief explanation out of the way, here is the code. Please, help yourself:

    #For the GET request portion, you'll at a minimum need requests and json. If you are working in Flask already, then you know the rest. 
    import requests
    import json 
    from wtforms import SelectField, SubmitField
    from flask import Flask, render_template, request
    from flask_wtf import FlaskForm
    
    #Creating the Flask instance.
    app = Flask(__name__)
    app.config['SECRET_KEY'] = 'password1234'
    
    #Creating the Form.
    class CitiesForm(FlaskForm):
        city = SelectField("Let's begin by choosing a city", choices=[])
        submit = SubmitField('Submit')
    
    #Routes, used for displaying the html on your web app.
    @app.route('/', methods=['GET', 'POST'])
    def index():
    
        #This is the section where you perform the GET request to retrieve the JSON from your ORDS endpoint.
        URL = '[your ORDS endpoint]'
        response = requests.get(URL)
        r = json.loads(response.text)
    
        #Creating the list of cities, which will be populated shortly.
        city_list = []
    
        #Looping through the nested array (from the initial JSON / API response), then sorting alphabetically (once, after the loop).
        for nestedArr in r['items']:
            city_list.append(nestedArr['city'])
        city_list.sort()
    
        #Finishing touches so WTForms and Flask know where to look, and setting up how/where the html on your web application renders.
        form = CitiesForm()
        form.city.choices = city_list 
        return render_template('index.html', form=form)

    Obviously, you’ll want to change up some of the names. But the ORDS pieces should remain pretty similar.

    Next up

    My next steps are to create a second and third dependent dropdown for the app. What I mean is that, based on a user’s city selection, that city would then be used as a query parameter for a subsequent GET request (to my ADB). And ORDS would respond with a list that includes restaurant information for the target city (everything else would be excluded).
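
    Although I haven’t written that part yet, the follow-up request would look something like this sketch (the endpoint is a placeholder, and I’m assuming the restaurants table also has a “city” column):

    import json
    import requests
    
    # The user's selection from the first dropdown feeds the next filter.
    city = "Raleigh"
    URL = '[your ORDS restaurants endpoint]'
    
    # requests percent-encodes the JSON filter for us.
    response = requests.get(URL, params={"q": json.dumps({"city": city})})
    restaurants = response.json()['items']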

    If you’ve made it this far and you have any questions, please send me a note.

    And if you think this is worth sharing, and that it might help others, please pass it around!

    Cash me outside

    As always, you can find me on:

  • Clean up a .CSV file with Regular Expressions, Pandas, and Python

    Clean up a .CSV file with Regular Expressions, Pandas, and Python

    Let us begin

    I log into Database Actions as my newly created “Python Developer” and navigate directly to the “Data Load” page (found under the Data Tools section of the Launchpad). I choose to “Load Data” from a “Local File.” I click “next,” click the pencil icon (see the arrow in the image), and navigate to the “File” tab. I scroll to the “RESTAURANTOPENDATE” column and see this:

    The RESTAURANTOPENDATE column, as seen during Data Load

    In a previous post (if you’re new, you can read the end-to-end process), I discussed how this time (“04:00:00+00” or “05:00:00+00”) wasn’t necessary for me. At that time, I used the “Find and Replace” function in Excel (I’m sure you can do the same with Numbers, Sheets, or Calc) to replace all occurrences of time with “” (i.e., nothing).

    But in the spirit of doing things in five days, when they could have taken but five minutes, I opted to see if I could achieve a similar result in python.

    Goal

    Create a python script that will allow you to remove part of a value in a .CSV file.

    WARNING: I thought this would be a simple task. I should have known better. My approach may not be ideal for your situation, but hopefully you’ll learn something. Or at the very least maybe you can bookmark this post, along with the resources (at the end of the post) I'm including for later use.

    Regular Expressions

    I am confident there is a way to achieve this goal with the csv library in python. There is probably even a way to do this with python out of the box. I couldn’t figure it out.

    I’m also reasonably confident that my approach is on the verge of ridiculous. Nevertheless, Regular Expressions, and the Pandas library, in python are what worked for me.

    What are Regular Expressions?

    Good question. I still don’t know, but here is what I found on Wikipedia:

    “A regular expression (shortened as regex or regexp; also referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually, such patterns are used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory.”

    That is so painful to read, but the way I understand it is that we use Regular Expressions in pattern-matching. Essentially you create a pattern and then tell your application (or script) to search for it. From there, you can include more code to perform more actions. In my case, those actions would be to find a pattern and replace that pattern with nothing.

    So what is the pattern?

    One of the benefits of having zero formal training in application development (in this case, computer science and formal language theory) is that occasionally, you might take an approach that, while unintuitive, works well enough.

    And after many, many hours of trial and error, parsing through Stack Overflow, reviewing hours of YouTube, reading pages of blogs, and occasional use of the “I’m Feeling Lucky” button on Google, it occurred to me that my pattern was:

    My first, informal attempt at describing the pattern

    After reviewing more documentation and other various resources (I have an entire section at the end), I more clearly identified a pattern:

    A more formal definition of the pattern

    Coming up with a clearly-defined pattern helped me create the following Regular Expression:

    '\s\d+:\d+:\d+[^a-c6]\d+$'

    I then did some more stuff, and that was it! Follow me for more…

    I’m kidding.

    Deep Dive

    I can’t teach you everything, because I’m probably only ahead of you by a week at this point. But I can explain, in the hopes that it will make sense. Let me continue with the above Regular Expression (again, you’ll want to spend some time in the resources section I’ve included to understand better how this all fits together).

    But the above Regular Expression can be defined like this:

    The Regular Expression, broken down piece by piece
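
    Before wiring it into Pandas, you can sanity-check the pattern against a single sample value (the date portion here is invented, but the time portion matches the “04:00:00+00” format from earlier):

    import re
    
    pattern = r'\s\d+:\d+:\d+[^a-c6]\d+$'
    sample = '2020-01-03 04:00:00+00'
    
    # The leading space, the time, and the "+00" offset all match and are removed.
    print(re.sub(pattern, '', sample))   # -> 2020-01-03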

    I then took the above and plugged it into this short script (a handful of working lines, plus comments):

    import pandas as pd
    #You need pandas to do anything related to dataframes (i.e. "df"); 
    # if you're a data scientist and you use python, you may already 
    # be familiar with this. Pandas also handles the Regular Expression 
    # matching internally, so you don't even need to import the re module here.
    
    d = '/Users/choina/Documents/untitled folder/Restaurants_in_Wake_County.csv'
    #I'm assigning the file to "d"
    df = pd.read_csv(d)
    #df means dataframe. Here I'm telling pandas to read the .CSV file 
    # that I just assigned ("d"), and consider that the "dataframe".
    
    print(df.dtypes)
    #This is optional, but when I do this I can see what the various 
    # datatypes are (according to pandas). The RESTAURANTOPENDATE column 
    # is an 'object'.
    
    df['RESTAURANTOPENDATE'] = df['RESTAURANTOPENDATE'].str.replace(r'\s\d+:\d+:\d+[^a-c6]\d+$', '', regex=True)
    #This says: treat the values in the RESTAURANTOPENDATE column as 
    # strings, find the part of each value that matches the Regular 
    # Expression, replace it with NOTHING, then take that column, along 
    # with the other columns, and...
    df.to_csv('newrest.csv')
    #Save it to a new .CSV file called "newrest.csv"

    Code as an image for reference:


    In short (I’ve left detailed comments for reference), this handful of lines searches for a pattern and, when found, replaces it with nothing. We then take the results (including the rest of the .CSV file) and save them to a new file called “newrest.csv.”

    Please, reference this script. Tweak it as needed. Or drop me a note if you need an outside perspective. Good luck!

    One more thing

    Only Pandas is truly required for this script to work.

    Note: An earlier draft of this script began with "from pickle import NONE, TRUE", and I wasn't sure why. While I understand Pickle is a library that deals with the serialization of objects in python, it has nothing to do with this script. My best explanation: I typed TRUE instead of python's built-in True, and my editor's auto-import feature "helpfully" pulled in pickle's TRUE constant to match. So if a "pickle" import ever just appears at the very top of your script, it probably isn't a ghost in your machine; check for a stray auto-import (and use the plain True you see in the updated code above).

    Moving on

    The rest is pretty straightforward.

    I then went back into Database Actions (it had been so long that I’d expected a new version had been released already), loaded my data, and inspected the column in question:

    The updated CSV file, as seen during Data Load

    Looks good to me!

    Magic

    Something I noticed was that my Autonomous Database automatically recognizes the “RESTAURANTOPENDATE” column as a data type of “Date.” I didn’t have to do anything special here, and I thought this was pretty cool. Take a look:

    Database Actions automatically recognizes the Date data type

    Addendum to the addendum

    Once imported, you can also use Regular Expressions in your database (with SQL). But if you’d like to clean your data before it reaches your ADB, the approach I’m writing about here is also a great option!

    Resources

    As promised, here are all the resources I’ve curated that might assist you should you choose to take the road less traveled…

    • Important/relevant documentation:
    • Stack Overflow – this is a series of “bookmarked” questions about Python and Regular Expressions that helped me. They might be of use. They aren’t intended to teach you, but rather to aid you in gaining a better perspective.
    • YouTube – a playlist I created that might be worth reviewing (at a minimum watch Case Digital’s “How to Remove Characters From a Pandas Dataframe In Python” video).
    • Regular Expression Editors:
      • Pythex (make sure you click the “Regular expression cheatsheet” button to expand/see a valuable quick reference – also, apparently a more complete version can be found here)
      • Regular Expressions 101 – this includes an enormous library of previously created Regular Expressions. There is a “debugger” too!

    An even simpler solution

    I discovered an even simpler solution to this problem. As I was reviewing the Pandas documentation, I noticed this functionality is built directly into Pandas. It can be done with the “to_datetime” function.

    And you can take it one step further with what is referred to as a “.dt accessor” (you’ll see this in the updated code I’m including). This accessor allows you to manipulate a Series (aka a column) within a Pandas dataframe (df).

    There are quite a few date-time properties you can manipulate in Pandas, all of which can be found here. The “Time Zone Operations” section is where the “.dt accessor” is briefly mentioned. For more context, I’d recommend reviewing that section to see how the accessor interplays with a Pandas Series.

    Don’t stop there, though; the main Python documentation discusses the basics of date and time operations. Without further ado, on to the new code:

    import pandas as pd 
    import numpy as np 
    
    #You don't have to do it like this. You could have a single "d" and then just comment/uncomment each line as needed. 
    
    d1 = '/Users/choina/Downloads/Food_Inspections.csv'
    d2 = '/Users/choina/Downloads/Food_Inspection_Violations copy.csv'
    d3 = '/Users/choina/Downloads/Restaurants_in_Wake_County copy.csv'
    
    #The same thing here, you could just have a single "d" and then just comment out the other two, when not needed.
    
    df_1 = pd.read_csv(d1)
    df_2 = pd.read_csv(d2)
    df_3 = pd.read_csv(d3)
    
    #Same goes for this too. This is either very slick of me, or incredibly lazy. 
    
    df_1['DATE_'] = pd.to_datetime(df_1['DATE_']).dt.date
    df_2['INSPECTDATE'] = pd.to_datetime(df_2['INSPECTDATE']).dt.date
    df_3['RESTAURANTOPENDATE'] = pd.to_datetime(df_3['RESTAURANTOPENDATE']).dt.date
    
    #Same, same here as well. You could do one at a time. But it works, I double-checked my work. 
    
    df_1.to_csv('newninspect.csv')
    df_2.to_csv('newviolate.csv')
    df_3.to_csv('newrest.csv')
    New python script to change date time in csv files, Chris Hoina, Senior Product Manager, ORDS, Database Tools, Oracle Autonomous Database

    About the update

    This code is certified fresh – meaning, it works. I triple-checked. The only thing I’ve noticed is that the three conversion lines must have “.dt.date” at the end. From what I’m reading (and inferring from the documentation), it looks as though you need to first “treat” the Series with the “to_datetime” function. After that, the entire dataframe is in limbo (not quite written out to a new .CSV), waiting. Before the next step, we can strip the time portion out using the .dt accessor (i.e., the “.dt.date” portion at the end of those three lines). A minimal sketch of this two-step follows below.

    From there it is simply a matter of writing out these updated dataframes to the three new expectant .CSV files.
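
    Here is that two-step in isolation, as a minimal sketch with invented date strings:

    import pandas as pd

    # Invented values, shaped like the dates in my columns.
    s = pd.Series(['5/17/2020 12:00:00 PM', '9/2/2021 8:30:00 AM'])

    # Step 1: to_datetime() parses the strings into real datetimes.
    # Step 2: the .dt.date accessor strips the time portion away.
    print(pd.to_datetime(s).dt.date)
    # 0    2020-05-17
    # 1    2021-09-02
    # dtype: object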

    Find Me

    Please come back for more. I’m going to be working on views in my Autonomous Database, and then later, I’ll REST-enable them with the help of ORDS!

  • GET requests with Python, ORDS, and the Oracle Autonomous Database

    GET requests with Python, ORDS, and the Oracle Autonomous Database

    Problem

    I wanted to take the Objects,

    Objects in SQL Worksheet in Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,
    Objects

    in my Admin schema and copy them into a newly created Python Developer database user/schema. I have been meaning to do this for a while now, since I’d like to approach all future development work as a typical [developer] user might.

    I previously created a “developer” user, but the recommendation was to create a user specifically for python development. This recommendation makes sense because I’d like to explore other developer frameworks — JavaScript/React, Derelicte, Golang, Hoobastank, Swift, Smashmouth.

    DISCLAIMER: Some of the aforementioned languages/frameworks are entirely fictional.

    So in creating this user and populating the schema, I could have just copied these objects over in seconds. I’m sure some already-existing functionality would have enabled me to do this. A less-elegant option would be to do a local upload: I’m confident I could export the tables of my Admin schema and then upload them as a developer in Database Actions. Doing so here:

    Data Load in Database Actions Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools

    But those options were too straightforward, effortless, and expeditious. I needed an approach that would make me question my life decisions. Something that would force me to rethink what it meant to be human.

    Mostly, I wanted an excuse to tinker with ORDS some more. While I didn’t start with a well-defined use case, I was able to retcon (a portmanteau of the words retroactive and continuity) one.

    Scenario

    The scenario is that we have REST-enabled schema objects (in this case, various tables), and maybe we want to execute GET requests against one of those tables. Perhaps we want all the data from that table (returned as JSON). And for whatever reason, we want to save this information in a JSON file. Plausible? Not likely. Possible? If I can do it, then yes, 100% possible.

    Solution

    NOTE: I've since disabled REST for these tables, so none of these URLs currently work. Also, for security reasons, you probably want to require authentication to access these. The below screenshots and code snippets are for demonstration purposes. In this case, that's probably acceptable, but please (PLEASE) exercise caution if you are recreating this at your workplace.

    REST-enabling

    When you REST-enable a table, you’re provided a URL (aka a REST Endpoint) for viewing. It will look something like this:

    https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/admin/scores/

    The response looks like this:

    First 25 results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    You are seeing the first 25 results from my “Scores” table. I’ve also created an alias name for that table as an extra level of protection. The table’s actual name is not “Scores,” but rather something else.

    If you collapse the “items,” it’s easier to see the rest of the “name: value” pairs:

    Collapsed results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    I can do my best to explain what we see here (a trimmed-down sketch of this response follows the list):

    • “items”: […] – the individual entries in my table (the columns would be the items you see in the black bold). ORDS also includes links for calling each single entry on its own
    • “hasMore”: true – essentially means that there are more pages of information to follow
    • “limit”: 25 – ORDS is only returning 25 results at a time. I didn’t set this; it was the default.
    • “offset”: 0 – this is tricky. From what I can tell, the offset only becomes a consideration after the first page. You begin with no offset (this makes no sense, I know, but I will explain shortly)
    • “count”: 25 – we have entries 1-25 on this page
    • “links”: […] – these are all pretty self-explanatory. Two links, in particular, are pretty interesting:
      • “describedby” and
      • “next”
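
    Here is that trimmed-down sketch; the values are invented, and I’ve elided most of the items:

    # A sketch of the JSON that ORDS returns for a REST-enabled table.
    ords_response = {
        "items": [
            {"id": 1, "score": 98, "links": [{"rel": "self", "href": "..."}]},
            # ...24 more entries like this one...
        ],
        "hasMore": True,   # more pages to follow
        "limit": 25,       # results per page (the default)
        "offset": 0,       # no offset yet; this is the first page
        "count": 25,       # entries 1-25 on this page
        "links": [
            {"rel": "describedby", "href": "..."},     # table metadata
            {"rel": "next", "href": "...?offset=25"},  # the next page
        ],
    }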

    If you click the “describedby” link, you’ll see something that looks like this:

    Described by URL ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    On this page, you can review valuable information without ever logging into Database Actions. I found this extremely helpful; from here, I could see the table name, Primary Key (if it exists), and the column names and types for the table.

    Another link I wanted to highlight was “next.” If you look closely, you’ll see that the “next” URL includes an offset of zero:

    Initial 25 results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    However, if I click that URL (I’ve collapsed all the items on this page as well), I can see that “offset” has increased to 25, and the new “next” URL shows “offset=50”. This next part will do a better job of explaining what is happening.

    Next 25 results via ORDS Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    Since we are on the second page of results, ORDS knows that we began with an offset of zero (the beginning of the table) and a limit of 25. Each “next” URL increments the offset by 25 and returns the next 25 results. And it will keep producing “next” URLs until the “hasMore: false” condition is finally satisfied. We can modify a lot of this, but this is what ORDS does for you automatically/by default. Great news for me, because at my current level of experience, the more steps I can automate, the better!
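
    To make that arithmetic concrete, here is a tiny sketch of how those “next” URLs grow (using the endpoint from above):

    # Each "next" link bumps the offset by the page limit (25 by default).
    base = 'https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/admin/scores/?offset={}'

    for offset in (0, 25, 50):
        print(base.format(offset))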

    Python

    Prepare yourself for a bit of a leap. At this point, I was pretty confident I could copy/paste my way through some Python code, pull down this data, and then save it in some format.

    Finally, after about a week, I was able to duct-tape something together. Courtesy of Google, Stack Overflow, and so many YouTube video tutorials (so, so many), here is what I’ve arrived at:

    import requests
    import json
    
    # requests is a python library that allows you to send GET,
    # PUT, DELETE, HEAD, and OPTIONS requests. I'm just doing
    # GET requests though. The json library comes with python,
    # but I need it here to be able to work with the json that
    # is returned. I'll also need it to create the json file
    # (at the end).
    
    offset = 0
    
    # I'm basically telling this little script that we should
    # expect to start with an offset of zero. If you recall,
    # this just means we are beginning on the first page.
    # Hence, there is no offset.
    
    url = 'https://gf641ea24ecc468-dbmcdeebyface.adb.us-ashburn-1.oraclecloudapps.com/ords/admin/places/?offset={}'
    
    # I've added in this modifier {}, which, as you'll see in a
    # bit, allows me to increment by 25 with each subsequent
    # GET request.
    
    hasMore = True
    
    # Read this as "hasMore is equal to True". In other words,
    # if there is more STUFF, then we should see "true" (go
    # back and look at the screenshots if you don't know what
    # this means.)
    
    filename = 'places.json'
    
    # "filename" is the json file we are going to create to
    # put all these results into.
    
    while hasMore:
    
        # We are saying, "While 'hasMore' is equal to True, then..."
    
        response = requests.get(url.format(offset))
    
        # ...recognize that "response" is a GET request that
        # consists of the URL (from above) plus whatever the
        # offset currently is.
    
        data = response.json()
    
        # data is just whatever the results are from the above
        # response.
    
        with open(filename, 'a') as json_file:
            json.dump(data, json_file)
    
        # Take the results from that URL and append them (that's
        # what the "a" in open(filename, 'a') means) to the
        # "places.json" file.
    
        if data.get('hasMore'):
            offset += 25
        else:
            hasMore = False
    
        # If the page we just saved has "hasMore: true", add 25
        # to the offset so the next GET request asks for the next
        # page, and keep going. If "hasMore" is false, there is
        # nothing more to request, so the loop stops.

    I’m including an image so the nesting is easier to see:

    python script for ORDS, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    And once I run this, I’m left with a JSON file (“places.json”) that has all my entries saved à la:

    json file output from ORDS, Oracle Autonomous Database, Chris Hoina, Senior Product Manager, ORDS, Database Tools,

    Now what?

    At this point, I have a few options. I could:

    • include code that would convert this JSON file to a .CSV file
    • include code to indent the JSON file, making it more readable (a rough sketch of this follows the list), or
    • drag and drop it into my Autonomous Database (via Data Load in Database Actions) with some slight tweaking (I just tried, so I already know I need to do more massaging)
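
    Here is what option two might look like, as a rough sketch. One wrinkle: since the script above appends one JSON object per page, the file holds several concatenated objects, so we decode them one at a time before pretty-printing:

    import json

    # Decode the concatenated page objects one at a time.
    decoder = json.JSONDecoder()
    pages = []

    with open('places.json') as f:
        text = f.read().strip()

    pos = 0
    while pos < len(text):
        page, pos = decoder.raw_decode(text, pos)
        pages.append(page)
        # skip any whitespace between the concatenated objects
        while pos < len(text) and text[pos].isspace():
            pos += 1

    # Write all the pages back out as one indented, readable document.
    with open('places_pretty.json', 'w') as f:
        json.dump(pages, f, indent=2)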

    The End

    So what have we learned here? Even somebody with very basic skills can interact with an Autonomous Database via ORDS.

    If you have an internet connection and are eager enough, the sky is the limit. Putting things into perspective, I wouldn’t have been able to do this about a week and a half ago.

    Truthfully, you’re not that far behind! I encourage you to give it a go.

    Find me