Major change to the ApiOpenStudio repository location

In order to implement pipelines and docker, with automated builds of docker images, the ApiOpenStudio projects have all been added to a new ApiOpenStudio group in GitLab.

This will enable GitLab pipelines to orchestrate pipelines across all of the projects as code is pushed and merged.

There was a dependency on this for upcoming tickets and tasks, so the tasks could not be delayed any longer. Because of this change, we have merged the develop branch to master branch, because this will update the wiki and phpdoc to reflect these changes.

However a new release tag for packagist has not been generated at this stage, becase we are only a few tasks away from beta release.

New changes available in the master branch:

  • GitLab CI pipelines now faster, (#118 – closed).
  • Wiki pages updated (#118 – closed & #115 – closed).
  • Fixed CI artefacts not being uploaded on failure (#117 – closed).
  • Logging now works on PHP8.0 as well as PHP7.4 (#111 – closed).
    • This involved deprecating Cascade, and creating a wrapper for the awesome Monolog package.
  • Implemented full JWT token authentication (#101 – closed).
  • Fix automated unit and functional tests (#110 – closed).
  • The entire project code has been updated to ensure all the latest PHPdoc and coding standards are passed.
  • Fixed Packagist for apiopenstudio_admin – sorry, this was my bad – it was a copy and paste error that went unnoticed.

Contributors and developers using the codebase

If you have a clone of the Gitlab repository, you will need to update your remote branch with the following command (assuming you have cloned with SSH):

git remote set-url origin git@gitlab.com:apiopenstudio/apiopenstudio.git

If you have a clone of the GitLab repository, you will need to update your remote branch with the following command (assuming you have cloned with SSH):

git remote set-url origin git@github.com:naala89/apiopenstudio.git

If you have forked the Gitlab repository, you can update the upstream URL:

git remote set-url upstream git@gitlab.com:apiopenstudio/apiopenstudio.git

The updated URLs

The new Group URL’s

The GitLab project URL’s

The GitHub mirror URL’s

Exciting upcoming features for the Beta release

  • Unit and Functional testing for PHP8.0 to ensure working across all contemporary PHP versions.
  • Composer 2.0 should be fine, but this should be tested before Beta release.
  • Swagger processor will be brought up to dat and fixed to allow importing and exporting of OpenApi documents.
  • Automated tagging and generation of an ApiOpenStudio Docker image

A Deep Dive into JWT Tokens in ApiOpenStudio Auth

Introduction

After a ton of work, over the next week or two we will be introducing JWT tokens for authentication in ApiOpenStudio.

This will:

  • Significantly increase the speed of resource requests.
  • Make individual transactions stateless.
  • Maintain the granular access rights to resources, based on the user’s access rights, the resource’s holding account & application, and of course the resource itself.
  • Ensure viability and ease for enterprise clients to use 3rd party authorisation services.

This article will take a look into the JWT specification, current practices and how it is used in ApiOpenStudio.

Rationale

While trying to optimise the authentication DB queries that are performed before the resource was processed, we came to the realisation that the queries were quite long and involved the joining of multiple tables and had several sub-queries so that the query could take into account our extensive range of access rights, like Administrators, Account managers, Applications Managers, Developers and Consumers. Although this would obviously be faster in a production environment, this was causing 1-2 seconds additional processing time to calls in development environments…

Because this had to be calculated every time a resource was called!

The solution was to introduce stateless authorisation, in the form of JWT tokens.

JWT is rapidly becoming the industry standard for authentication and JWT tokens have what are called Claims, which are individual name/value pairs within the body of the token and the token itself is encrypted and secure, which means that sensitive data can be securely included in the token and thus means that the user’s roles and permissions can be included as a claim and only need to be fetched once (during the JWT token GET call).

In addition, JWT tokens have a TTL, which means that we do not need to store a bearer token in the DB against each user (use that to fetch the user that each time a request is received) – because if the token is valid, then there is no need to fetch the user and check the bearer token TTL.

Here are some scenarios where JSON Web Tokens are useful:

Authorisation: This is the most common scenario for using JWT. Once the user is logged in, each subsequent request will include the JWT, allowing the user to access routes, services, and resources that are permitted with that token. Single Sign On is a feature that widely uses JWT nowadays, because of its small overhead and its ability to be easily used across different domains.

Information Exchange: JSON Web Tokens are a good way of securely transmitting information between parties. Because JWTs can be signed—for example, using public/private key pairs—you can be sure the senders are who they say they are. Additionally, as the signature is calculated using the header and the payload, you can also verify that the content hasn’t been tampered with.

auth0.com, JSON Web Token Introduction – jwt.io, viewed 1 September 2021, https://jwt.io/introduction.

How does JWT work?

Pronunciation

Before we start, for a bit of fun I’d like to set the record straight. I’m not one of those boring “absolutists” who insist that GIF should be pronounced “JIF”, even though the original creator of the tech obviously could not spell when he declared that was the pronunciation. But many people are driving me bananas, by mispronouncing JWT: just pronounce it as “JOT”, it’s also easier to say than the most common variant “Jay-Dubbyah-Tee” (this is not a relative to a former US president).

The suggested pronunciation of JWT is the same as the English word “jot”.

Jones M, Microsoft, Bradley J, Ping Identity, Sakimura N, NRI 2020, JSON Web Token (JWT), Internet Engineering Task Force (IETF), viewed 1 September 2021, https://datatracker.ietf.org/doc/html/rfc7519#section-1.

JWT overview

Now for the technical stuff…

JWTs represent a set of claims as a JSON object that is encoded in a
JWS and/or JWE structure. This JSON object is the JWT Claims Set.
As per Section 4 of RFC 7159 [RFC7159], the JSON object consists of
zero or more name/value pairs (or members), where the names are
strings and the values are arbitrary JSON values. These members are
the claims represented by the JWT. This JSON object MAY contain
whitespace and/or line breaks before or after any JSON values or
structural characters, in accordance with Section 2 of RFC 7159
[RFC7159].


The member names within the JWT Claims Set are referred to as Claim
Names. The corresponding values are referred to as Claim Values.
The contents of the JOSE Header describe the cryptographic operations
applied to the JWT Claims Set. If the JOSE Header is for a JWS, the
JWT is represented as a JWS and the claims are digitally signed or
MACed, with the JWT Claims Set being the JWS Payload. If the JOSE
Header is for a JWE, the JWT is represented as a JWE and the claims
are encrypted, with the JWT Claims Set being the plaintext encrypted
by the JWE. A JWT may be enclosed in another JWE or JWS structure to
create a Nested JWT, enabling nested signing and encryption to be
performed.


A JWT is represented as a sequence of URL-safe parts separated by
period (‘.’) characters. Each part contains a base64url-encoded
value. The number of parts in the JWT is dependent upon the
representation of the resulting JWS using the JWS Compact
Serialization or JWE using the JWE Compact Serialization.

Jones M, Microsoft, Bradley J, Ping Identity, Sakimura N, NRI 2020, JSON Web Token (JWT), Internet Engineering Task Force (IETF), viewed 1 September 2021, https://datatracker.ietf.org/doc/html/rfc7519#section-3

What does this mean?

A token is comprised of 3 parts, the header, payload and signature.

The header of the token defines the cryptography applied to the payload, and the payload is a JSON structure, that is encoded in JWS (Base64 encoded) or JWE (encrypted).

JWS is less secure since it is only a Base64 encoded JSON, and this means it is not suitable in our case. Because the token will carry authentication data for access to the resource and Base64 encoding is different to encryption – it is relatively trivial to decode.

JWE is more suitable to our case, because the JSON payload is encrypted (as defined in the header), and the JWT token keys are stored securely on the on the ApiOpenStudio server, so only the API server can decrypt the body.

There are multiple encryption standards available, and in our case we are using the fantastic lcobucci-jwt library, which is sponsored by one of the leading Authorisation services: Auth0. This provides support for many, many symmetric and asymmetric algorithms.

We have not implemented symmetric algorithms (these are less secure and are the same key for encryption and decryption), so with asymmetric algorithms, the public key can be used by any clients, and the public/private keys are stored in a secure location on the ApiOpenStudio server.

JWT token structure

In its compact form, JSON Web Tokens consist of three parts separated by dots (.), which are:

  • Header
  • Payload
  • Signature

Therefore, a JWT typically looks like the following.

xxxxx.yyyyy.zzzzz

Header

The header typically consists of two parts: the type of the token, which is JWT, and the signing algorithm being used, such as HMAC SHA256 or RSA.

For example:

{
  "alg": "RSA256",
  "typ": "JWT"
}

Then, this JSON is Base64Url encoded to form the first part of the JWT.

Payload

The second part of the token is the payload, which contains the claims. There are three types of claims: registeredpublic, and private claims.

  • Registered claims: These are a set of predefined claims which are not mandatory but recommended, to provide a set of useful, interoperable claims. Some of them are: iss (issuer), exp (expiration time), sub (subject), aud (audience). Notice that the claim names are only three characters long as JWT is meant to be compact.
  • Public claims: These can be defined at will by those using JWTs. But to avoid collisions they should be defined in the IANA JSON Web Token Registry or be defined as a URI that contains a collision resistant namespace.
  • Private claims: These are the custom claims created to share information between parties that agree on using them and are neither registered or public claims.

An example payload could be:

{
  "iss": "my.apiopenstudio.com",
  "sub": "1234567890",
  "name": "John Dory",
  "admin": true
}

The payload is then Base64Url encoded to form the second part of the JSON Web Token.

Signature

To create the signature part you have to take the encoded header, the encoded payload, a secret, the algorithm specified in the header, and sign that.

The signature is used to verify the message wasn’t changed along the way, and in the case of tokens signed with a private key, it can also verify that the sender of the JWT is who it says it is.

Using JWT tokens

The final xxxxx.yyyyy.zzzzz token is sent in the request as a bearer token in the request header, e.g.:

Authorization: Bearer <token>

When the request is received by the API server, it will first confirm that the token is valid, checking it can be decrypted, the issuer, expiry date and mandatory claims.

If this all passes, the processing can continue, otherwise a 401 error response is sent and the client will need to generate a new token, using the provided core token request (auth/token) and resend the request with the token that was received.

From a processes POV, this is exactly the same as before. However, as mentioned, we are using asymmetric encoding, so we can include data in the token payload that means that ApiOpenStudio does not need too fetch user data in order to validate the user’s permissions against the resource. That only needs to be fetched once – when the token is generated.

Minor issue (caveat)

The original authorisation tokens were stateful, i.e. the token was stored against the user, along with the TTL for the token. This meant that if a user was banned, deleted or made inactive, they would instantly not be able to make any further API requests, due to either the users not being present/active anymore, or the token was no longer valid.

JWT tokens are stateless. This means that if a user has a valid token, they can still use it until it expires regardless of whether they are made inactive or deleted (they would only be prevented from fetching a fresh token).

This can be mitigated by setting a global jwt_life of less than 1 hour (the default), however this needs to be balanced against the increase in requests, since a lower token TTL will lead to more requests due to tokens frequently passing their expiry date more frequently.

Another mitigation for extreme cases can be to add the IP address of the client to the blacklist – this will prevent all future calls from that location, immediately.

3rd party authorisation integration

Because the token requires certain client data to be present, the user details and roles will need to be accessible by the authorisation provider, so that they can generate a valid body. Thankfully, most reputable providers allow you to upload these details to your account with them, and also provide ways to ensure that these details are always current and up to date.

You will need to ensure that the following mandatory claims:

  • iss – JWT issuer (your auth provider)
  • aud – permitted for (your api)
  • iat – JWT issued time
  • exp – JWT expiry time

The following ctsom claims are also included

  • uid – user ID
  • roles – complete list of roles and accounts/applications that the user is associated with

The roles object

This is in the JSON object format of:

roles: [
    {
        "role_name": <role_machine_name>,
        "accid": <account_id>,
        "appid": <application_id>
    }
]

For example:

roles: [
    {
        "role_name": "administrator",
        "accid": null,
        "appid": null
    },
    {
        "role_name": "consumer",
        "accid": 34,
        "appid": 5
    },
    {
        "role_name": "developer",
        "accid": 34,
        "appid": 5
    },
    ....
]

Note that:

  • “administrator” does not require accid or appid
  • “account_manager” does not require appid

Summary

We’re really excited to be implementing this technology, seeing the decrease in resource processing time, increasing the security of the API, and making it even easier for enterprise scale users to implement ApiOpenStudio on a large scale.

I’ll be honest with you too, it was really good fun to implement, and we totally got a total nerd-on, doing the research and coding for this!

Increased security and speed with JWT tokens

Current dev work is almost complete for implementing authorisation with JWT token for all resources! This will be part of the upcoming Beta release.

The ticket can be viewed in Gitlab.

This will replace the existing alpha version of a custom token and token TTL for each user in the user table.

It is quite important to note, before we move on, that JWT tokens are a different thing to oauth2, implicit grant, explicit grant, application grant and PKSE authorisation flow. JWT is only a standard for tokens. If you need to implement oauth2 or other similar workflows this is separate from the JWT implementation.

The problem

The problem with the former approach, was that resource requests had to make DB calls to the user, user_roles, roles, account and application tables in order to verify user permissions to that particular resource, FOR EVERY API CALL. This obviously negatively impacted performance for API calls.

This also meant that authorisation was not easily scalable to authorisation servers for enterprise implementation, because the implementation of the token and authorisation for API calls was tightly coupled to the ApiOpenStudio database and several of its tables.

The solution

Although the former approach was stateful (it maintained login state, so users could login and out), the stateless JWT token approach means that the token does not need to be stored in the database. The downside of stateless JWT tokens, is that there is no logout state. So if a user’s access is revoked, they will still have access to resources until their current token goes stale.

However, this can be mitigated by making the JWT token lifetime short in the ApiOpenStudio configuration.

Each JWT token contains custom claims for user ID and all roles that that user has. So when the initial request is received by ApiOpenStudio, it just decrypts the token and validates the user’s roles against the resources account/application and permissible user roles (i.e. Does the current user have the required role access to the account & application?).

Knock-on effects

The following processors have been retired:

  • user_login.
  • user_logout.

Nearly all core resources have been updated use the new processors:

  • generate_token (generate a valid JWT token for a user, with custom claims: uid, user roles).
  • validate_token (validate the Authorization token as a valild JWT token).
  • validate_token_roles ((validate the Authorization token as a valild JWT token and also validation the user has the correct role permissions for the resource).
  • bearer_token (not used by core atm, but preserved for any processors that need access to the bearer token).

Processors have been optimised, now that they do not need to do any pre-validation on who can do what – this is left to the core resource definitions.

Tests are updated to incorporate the changes, and also now have multiple test users with different roles.

The good news

Not only has this significantly improved the API response time, it has now made the API much more scalable for enterprise. We communicated and researched several major 3rd party authorisation services, including auth0, to make sure that the decision to move to JWT tokens and custom claims would still be viable if a 3rd party auth server was used.

Most 3rd party authorisation services implement linking into external databases, so that would take the heat off the api server for token generation, and allow the token generation to be completely decoupled from ApiOpenStudio. This will be the subject of a future post.

Joining the API economy

We’ve all heard about the API economy and the extra revenue it can provide while increasing the network and visibility of the business. We will be discussing the processes and advice for how you would actually join the API economy.

Types of API’s

There are basically two areas of API’s:

  • Internal API’s that are never exposed to the outside world, and are generally intended for a micro-service architecture. The benefits and challenges of this will be discussed in a separate post.
  • Externally exposed API’s that offer data and services to 3rd parties. These can either be free or paid.

This post will deal with externally exposed API’s. Purely internal API’s are not strictly part of the API economy, these are services within the company.

Moving into the API economy

The decision to move into the API economy might require a cultural shift within your business, and one that can be that would be very beneficial. It is primarily a business decision, rather than being left solely to the IT department to find ways of using the data that they have collected for the benefit of the business. This is a good thing! It requires all of the business to get together and decide on what data they want to share, is there already enough data to share, what extra data and metrics need to be collected, how will this be collected, does the data need to be changed. etc.

Approach

I would recommend taking a top-down approach to this, rather than launching your IT dept into coding your great idea. The planning of this is very much a business decision, and each department should be involved at nearly every stage, as you move from project inception to meetings and discussions of potential merits of the plan and ideas this will spawn, through to final planning and execution.

This might require a cultural change in your departments, as the different departments start to think about what assets they have or can create to be added to the API suite. They will probably find that they need to change processes and approaches in order to fully embrace this.

REST APIs

Defining what a REST API can do is a separate topic for another post. But essentially, it is built on the rather convenient request types in a HTML request:

  • POST
  • GET
  • PUSH/PUT
  • DELETE

These allow for Create, Read, Update and Delete requests to be made over the API. If you want to impress your IT team, the acronym for this is CRUD. Thus, you can merely Read (i.e. GET) data or you can also Create (POST), Update (Push or Put) and Delete (DELETE) data.

GraphQL APIs

Defining what a GraphQL API can do is another separate topic for a post. But essentially, it is addresses one of the shortcomings of the REST structure: meta-links.

REST has a shortcoming in that you cannot specify data selection parameters and related items in the same request without a custom attributes in the query. So this leads to multiple round trips and requests, e.g. fetch all posts, then the for each post. Each of these items would then contain links for subsequent requests to fetch each things like comments or taxonomy terms for each post. This can significantly increase the data loading time.

GraphQL addresses this problem by allowing an API request to include data structure and request elements in it. Thus, you can fetch your data in one request.

Commercial benefits

Commercial benefits should be made to either make the API’s free or only accessed through a payment gateway and account access to the API’s. Once that is decided upon, Security and volume loads need to be considered. With the explosion of free and commercially driven API’s along with the massive increase in Javascript frameworks and headless architecture, traffic for the could potentially be high, so provision will have to made in the server architecture to be scalable. This is a huge topic for a separate post.

Thought should be made into what service you are providing to 3rd parties and customers:

  • What benefits will they get from these new data and service endpoints?
  • How easy will it be to use and access?
  • What will the format of the data be?
  • Will the customers require any customisation and tailoring to their needs of the services? For instance, Uber’s custom requirements from Google maps API’s
  • Is there a business model for customisation, etc?

If access to the API is going to be limited to paying customers or selected 3rd parties then access control needs to be implemented. This is where ApiOpenStudio and some other API frameworks come into their own. You can define users, departmental and account roles for individual users or groups and then define what access rights these roles have to individual API resources. Perhaps you only want to give a 3rd party Read access to specific data, whilst giving one of your departments full Create/Read/Update/Delete access to all or a subset of the data. Maybe your API model wants to enable a 3rd party or department the ability to control their own silo’d data – so that data would be private to them, but they would have Create/Read/Update/Delete to their own data and only they would have access to it over the API’s (with the exception of you monitoring the data for security, API request rates and data volume control).

Creating your APIs

Before you dive straight into creation of the API’s, you should also consider the API’s from the user’s viewpoint. How easy will they be to use, do they provide data in the format that is most easy for me to consume, how will I discover these resources, i there any benefit for me to create code to consume the API’s, what other competitive resources are out there, are they better?

Once you have decided on the basic API model that you want to provide, you can start getting down to the nitty gritty of defining each resource and what it will do. ApiOpenStudio, and paid-for-services like MuleSoft will allow you to import API resource definitions from Swagger. If the API resources need processing logic on the data before final delivery, this should be defined and created. This is very simple in ApiOpenStudio, it is designed specifically to make this quick and easy. Meaning you do not need to employ expensive developers who are experts in a specific coding language to implement them (which can also be a time costly exercise).

Once you are ready to go, you need to pay specific attention the marketing of the new API suite. If you just put it out there and wait for the customers to come in, it is almost certainly going to fail. It is very important to put thought into how you will let people and companies know about the API. Maybe an email blast to your customers, creation of a specific website for the suite to expose it to the public, blogging, getting listed in aggregate listings of API’s, etc.

Alpha release is nearly there!

Alpha release is so close we can touch it! We are getting very excited about this now!

This is the final countdown, with only a couple of weeks left:

  • This site is nearing completion
  • The ApiOpenStudio has gone through a major refactor after the pre-release branding change.
  • We have about 90% of my Functional tests reviewed and working.
  • Domains and servers are setup and working nicely
  • The wiki has been worked into and worked into and is looking nice
  • The PHPDoc is performing nicely
  • GitLab pipelines is now pretty much working as we want it to:
    • Linting on all merges and PR’s
    • Compiling and deploying the wiki to dev and prod
    • Compiling and deploying the PHPDoc to dev and prod
  • The GitHub mirror has been set up

I’m disappointed that I’ve so far been unsuccessful in integrating the Codeception tests with GitLab CI. I really wanted that ready for the initial release. However, since this is Beta release, I’ll aim to get that ready before the Alpha release.

See /roadmap for more details of the ongoing work.

WP Twitter Auto Publish Powered By : XYZScripts.com
Social media & sharing icons powered by UltimatelySocial
RSS
LinkedIn
Share
WhatsApp