Introduction

The documentation site for the light-portal application.

Architecture

Design

Light Portal is an application that connects providers to consumers, and it is composed of many components or applications. Each component exposes a set of API endpoints and a user interface within the Portal View single-page application.

To help users understand the design of each component in detail, we have collected all the design documents in this section.

Multi-Tenant

Database Schema

Adding a host_id to every table is one approach, but it does lead to composite primary keys and can impact performance. Using UUIDs as primary keys, even in a multi-tenant environment, is another viable option with its own set of trade-offs. Let’s examine both strategies:

  1. Host ID on Every Table (Composite Primary Keys)

Schema: Each table would have a host_id column, and the primary key would be a combination of host_id and another unique identifier (e.g., user_id, endpoint_id).

CREATE TABLE user_t (
    host_id UUID NOT NULL,  -- References hosts table
    user_id INT NOT NULL, 
    -- ... other columns
    PRIMARY KEY (host_id, user_id),
    FOREIGN KEY (host_id) REFERENCES hosts_t(host_id)
);

Pros:

  • Data Isolation: Clear separation of data at the database level. Easy to query data for a specific tenant.

  • Backup/Restore: Simplified backup and restore procedures for individual tenants.

Cons:

  • Composite Primary Keys: Can lead to more complex queries, especially joins, as you always need to include the host_id. Can affect query optimizer performance.

  • Storage Overhead: host_id is repeated in every row of every table, adding storage overhead.

  • Index Impact: Composite indexes can sometimes be less efficient than single-column indexes.

  2. UUIDs as Primary Keys (Shared Tables)

Schema: Tables use UUIDs as primary keys. A separate table (tenant_resource_t) maps UUIDs to tenants.

CREATE TABLE user_t (
    user_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    -- ... other columns
);


CREATE TABLE tenant_resource_t(
    host_id UUID NOT NULL,
    resource_type varchar(255) NOT NULL, --e.g., 'user', 'api_endpoint'
    resource_id UUID NOT NULL,
    PRIMARY KEY(host_id, resource_type, resource_id),
    FOREIGN KEY (host_id) REFERENCES hosts_t(host_id)
);

Pros:

  • Simplified Primary Keys: Easier to manage single-column UUID primary keys. Simpler joins.

  • Reduced Storage Overhead: No need to repeat host_id in every table.

  • Application Logic: Multi-tenancy is handled mostly in the application logic by querying tenant_resource_t to ensure a user belongs to the correct tenant, adding a layer of flexibility. (This is also a con if not carefully implemented.)

Cons:

  • Data Isolation (slightly reduced): Data is logically separated but resides in shared tables. Robust application logic is essential to prevent data leakage between tenants.

  • Backup/Restore (more complex): Backing up or restoring a single tenant requires filtering based on the tenant_resource_t table.

  • Query Performance (potential issue): Queries might require joining with the tenant_resource_t table, which adds some overhead. Proper indexing and query optimization become crucial.

  3. Hybrid Approach (Recommended in many cases)
  • Combine the strengths of both approaches.

  • Use UUIDs as primary keys for most tables for simplicity.

  • Add host_id to tables where data isolation is paramount (e.g., tables containing sensitive financial or personal data) or where frequent tenant-specific queries are performed. This provides a more granular level of control and allows for optimized queries when needed.

  • Use the tenant_resource_t table to maintain an overview of which resources belong to which tenant, supporting the application logic’s multi-tenancy enforcement. A sketch of this hybrid layout follows the list.
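
To make the hybrid layout concrete, here is a minimal sketch under stated assumptions: payment_t is a hypothetical tenant-scoped table used only for illustration, and the :host_id / :user_id placeholders stand for values supplied by the application.

-- Tenant-scoped table: host_id is part of the key where strict isolation matters (illustrative)
CREATE TABLE payment_t (
    host_id    UUID NOT NULL REFERENCES hosts_t(host_id),
    payment_id UUID NOT NULL DEFAULT uuid_generate_v4(),
    amount     NUMERIC(12,2) NOT NULL,
    PRIMARY KEY (host_id, payment_id)
);

-- For shared tables keyed by UUID only (like user_t above), tenancy is enforced in
-- the application by checking tenant_resource_t before serving the request.
SELECT 1
FROM tenant_resource_t tr
WHERE tr.host_id = :host_id
  AND tr.resource_type = 'user'
  AND tr.resource_id = :user_id;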

Choosing the Right Strategy:

  • Small Number of Tenants, Low Data Volume: UUIDs might be sufficient.

  • Strict Data Isolation Requirements, Frequent Tenant-Specific Queries, High Data Volume: host_id on every relevant table or a hybrid approach is often preferable.

  • Flexibility in Multi-Tenancy Implementation: Favors the UUID approach or Hybrid approach, with logic predominantly handled in the application layer.

Additional Considerations:

  • Database Support: Check whether the database (PostgreSQL) offers features that help with multi-tenancy, such as row-level security.

  • Indexing: Carefully plan indexes for optimal performance in any multi-tenant scenario.

  • Schema Design: Consider creating views for tenant-specific data access to simplify queries in the application layer (a sketch follows this list).

  • Security: Implement robust security measures to prevent data leakage between tenants, regardless of the chosen approach.
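
As a small illustration of the view idea, the following sketch assumes the composite-key variant of user_t shown earlier; the view name tenant_user_v and the app.current_host session setting are hypothetical.

-- The application sets the tenant context for the session first, e.g.:
-- SET app.current_host = '<tenant uuid>';
CREATE VIEW tenant_user_v AS
SELECT u.*
FROM user_t u
WHERE u.host_id = current_setting('app.current_host')::uuid;

-- Queries no longer need to repeat the host_id predicate
SELECT * FROM tenant_user_v WHERE user_id = 12345;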

The hybrid approach usually provides the best balance between performance, data isolation, and development complexity in real-world multi-tenant applications. You have more direct control where strict isolation is required and can maintain the simplicity of UUIDs where it’s beneficial. Using tenant_resource_t for application logic enforcement offers flexibility and a central point of management for tenant resource association. This approach also prepares the application for potential future expansion and different multi-tenancy demands as the application grows.

Should host_id be part of the PK?

This is a classic multi-tenancy design question. Both approaches have implications. Let’s analyze them:

Option 1: Current Approach - PK (host_id, instance_id)

  • Pros:
    • Excellent for Tenant-Specific Queries: This is the biggest advantage. Queries like SELECT ... FROM instance_t WHERE host_id = ? AND ... or SELECT ... FROM instance_t WHERE host_id = ? ORDER BY instance_id can directly and efficiently use the primary key index. The index is naturally ordered by tenant first, then by instance within that tenant. This improves data locality for a specific tenant’s data.
    • Clear Logical Grouping: The primary key explicitly represents the concept that an instance belongs to a specific host (tenant).
    • Enforces Uniqueness Per Tenant: Guarantees that instance_id is unique within a given host_id. (Although UUIDv7 makes global collisions highly unlikely anyway).
  • Cons:
    • Wider Primary Key: The PK is 32 bytes (16+16).
    • Wider Foreign Keys: Any table referencing instance_t would need both host_id and instance_id as its foreign key columns.
    • Slightly Larger Secondary Indexes: Other indexes on instance_t will implicitly include both PK columns, making them slightly larger than if the PK was just 16 bytes.

Option 2: Alternative - PK (instance_id)

  • Pre-requisite: This only works if your application guarantees that instance_id is globally unique across all hosts/tenants. Given you’re using UUIDv7, this is a safe assumption in practice, but the schema wouldn’t enforce uniqueness per host explicitly via the PK itself.
  • Pros:
    • Narrower Primary Key: The PK is only 16 bytes.
    • Simpler Foreign Keys: Tables referencing instance_t only need a single instance_id column for the foreign key.
    • Slightly Smaller Secondary Indexes: Other indexes on the table will be marginally smaller.
  • Cons:
    • Requires Separate Index for Tenant Queries: You would absolutely need a separate index on (host_id, instance_id) (or at least (host_id)) for efficient tenant-specific queries (WHERE host_id = ?). Without it, querying for a specific tenant’s data would require less efficient scans. This index would likely be a UNIQUE index anyway to enforce the logical relationship: CREATE UNIQUE INDEX instance_t_host_instance_idx ON instance_t (host_id, instance_id);
    • Potential Reduced Locality: While the separate index helps, the primary key index itself (based only on instance_id) might interleave data from different tenants physically, potentially slightly reducing cache efficiency for queries scanning many instances for a single tenant compared to the composite PK approach.

Recommendation:

Stick with the composite primary key: PRIMARY KEY(host_id, instance_id).

Reasoning:

  1. Performance for Core Use Case: In multi-tenant systems, filtering by the tenant identifier (host_id) is almost always the primary access pattern. Having host_id as the leading column in the PK index directly optimizes this critical path.
  2. Index Necessity: Even if you chose instance_id as the sole PK, you would still need to create an index on (host_id, instance_id) for performance. Making this essential index the primary key index is often the most straightforward and efficient approach.
  3. Clarity: The composite key clearly reflects the logical relationship and ownership.
  4. Cost: The “cost” of a 32-byte PK vs. a 16-byte PK is often negligible compared to the performance gains achieved by aligning the PK index with the dominant query patterns in a multi-tenant architecture. The impact on FKs and secondary indexes is real but usually acceptable.

Using just instance_id as the PK prioritizes global uniqueness and FK simplicity over optimizing tenant-specific queries directly via the PK index. In most multi-tenant scenarios, optimizing tenant queries is more important.
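
To make the recommendation concrete, here is a minimal sketch of the composite-key layout, with a child table (named instance_api_t here for illustration) showing how the composite foreign key propagates:

CREATE TABLE instance_t (
    host_id     UUID NOT NULL REFERENCES hosts_t(host_id),
    instance_id UUID NOT NULL,             -- UUIDv7 generated by the application
    -- ... other columns
    PRIMARY KEY (host_id, instance_id)
);

-- A child table referencing instance_t carries both key columns in its FK
CREATE TABLE instance_api_t (
    host_id     UUID NOT NULL,
    instance_id UUID NOT NULL,
    api_id      UUID NOT NULL,
    PRIMARY KEY (host_id, instance_id, api_id),
    FOREIGN KEY (host_id, instance_id) REFERENCES instance_t (host_id, instance_id)
);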

Citus PostgreSQL Extension

Citus, a distributed database extension for PostgreSQL that is now fully open source, can be very helpful in scaling your multi-tenant application, especially if you anticipate significant data growth and high query loads. Here’s how Citus can fit into your use case and the factors to consider:

How Citus Helps:

  • Horizontal Scalability: Citus allows you to distribute the data across multiple PostgreSQL nodes (servers), enabling horizontal scaling. This is crucial for handling increasing data volumes and query loads in a multi-tenant environment.

  • Improved Query Performance: By distributing data and queries, Citus can significantly improve the performance of many types of queries, especially analytical queries that operate on large datasets. This is particularly beneficial if we have tenants with substantially different data volumes or query patterns.

  • Shard Placement by Tenant: One of the most effective ways to use Citus for multi-tenancy is to shard the data by host_id (or a tenant ID). This means that all data for a given tenant resides on the same shard (a subset of the distributed database). This allows for efficient tenant isolation and simplifies queries for tenant-specific data.

  • Simplified Multi-Tenant Queries: When sharding by tenant, queries that filter by host_id become very efficient because Citus can route them directly to the appropriate shard. This eliminates the need for expensive scans across the entire database.

  • Flexibility: Citus supports various sharding strategies, allowing you to choose the best approach for the data and query patterns. You can even use a hybrid approach, distributing some tables while keeping others replicated across all nodes for faster access to shared data.

Example (Sharding by Tenant):

Create a distributed table: When creating tables (e.g., user_t, api_endpoint_t, etc.), we would include host_id in the primary key and declare them as distributed tables in Citus, using host_id as the distribution column:

CREATE TABLE user_t (
    host_id UUID NOT NULL,
    user_id UUID NOT NULL DEFAULT uuid_generate_v4(),
    -- ... other columns
    PRIMARY KEY (host_id, user_id)
);

SELECT create_distributed_table('user_t', 'host_id');

Querying: When querying data for a specific tenant, include the host_id in the WHERE clause:

SELECT * FROM user_t WHERE host_id = 'your-tenant-id';

Citus will automatically route this query to the shard containing the data for that tenant, resulting in much faster query execution.

Citus Cost:

  • Citus Open Source: The Citus open-source extension is free to use and can be installed alongside your own PostgreSQL deployment. We can self-host and manage it.

  • Azure Cosmos DB for PostgreSQL (Managed Citus): Microsoft offers a fully managed cloud service called Azure Cosmos DB for PostgreSQL, which is built on Citus. This service has usage-based pricing, and the cost depends on factors like the number of nodes, storage, and compute resources used. This managed option reduces the operational overhead of managing Citus yourself.

Recommendation:

Don’t automatically add host_id to every table just because we are using Citus. Carefully analyze the data model, query patterns, and multi-tenancy requirements.

  • Distribute tables by host_id (tenant ID) when data locality and isolation are paramount, and we want to optimize tenant-specific queries.

  • Consider replicating smaller, frequently joined tables to avoid unnecessary joins and host_id overhead.

  • Use a central mapping table (tenant_resource_t) to manage tenant-resource associations and enforce multi-tenancy rules in the application logic where appropriate.

This more nuanced approach balances the benefits of distributing data with Citus against the unnecessary complexity or performance overhead of overusing host_id. Choose the Citus deployment model (self-hosted open source or managed cloud service) that best suits your needs and budget.

Primary Key Considerations in a Distributed Citus Environment

When a table includes host_id (due to sharding requirements), it is important to include host_id as part of the primary key. This ensures proper functioning and optimization within the Citus distributed database.

  1. Distribution Column Requirement
    In Citus, the distribution column (e.g., host_id) must be part of the primary key. This is essential for routing queries and distributing data correctly across shards.

  2. Uniqueness Enforcement

    • The primary key enforces uniqueness across the entire distributed database.
    • For example, if user_id is unique only within a tenant (host), then (host_id, user_id) is required as the primary key to ensure uniqueness across all shards.
  3. Data Locality and Co-location
    Including host_id in the primary key ensures that all rows for the same tenant (identified by the same host_id) are stored together on a single shard. This provides:

    • Efficient Joins: Joins between tables related to the same tenant can be performed locally on a single shard, avoiding expensive cross-shard data transfers.
    • Optimized Queries: Queries filtering by host_id are efficiently routed to the appropriate shard.
  4. Referential Integrity
    If other tables reference the user_t table and are also distributed by host_id, including host_id in the primary key of user_t is essential to maintain referential integrity across shards (a co-location sketch follows this list).
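
A brief sketch of co-location under these rules, assuming user_t and a hypothetical api_endpoint_t are both distributed by host_id and carry it in their primary keys:

-- Both tables are distributed on host_id (see create_distributed_table above),
-- so all rows for one tenant live on the same shard.
-- A join that filters on a single host_id is routed to that one shard and
-- executed locally, with no cross-shard data transfer.
SELECT u.user_id, e.api_id
FROM user_t u
JOIN api_endpoint_t e ON e.host_id = u.host_id
WHERE u.host_id = 'your-tenant-id';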

Multi-Host User Session Management

In a multi-host environment where multiple hosts reside on the same server, a user must be associated with one host at a time. Session management is handled as follows:

  1. Host Association on Login:

    • Once a user logs in, a host cookie is returned, derived from the JWT token.
    • The user’s session defaults to the associated host in the cookie.
  2. Switching Hosts:

    • If a user wishes to switch to another host, they can:
      • Access the User Menu to select a different host.
      • Log out of the current session.
    • During the next login, the session will be tied to the newly selected host.
  3. Host in API Requests:

    • For all API requests sent to the server, the host is typically included as part of the request payload.
    • For logged-in users, the host is in the JWT token as a custom claim.
    • For guest users, the default host is used until the user is signed in.
    • This ensures proper routing and handling of requests in a multi-host environment.

By associating users with a specific host for each session, this approach ensures clear separation of data and responsibilities across hosts, while giving users the flexibility to switch hosts as needed.

Event Header

As the portal is based on event sourcing, all events are responsible for populating the database, so they need to be separated by host_id as well. The event header contains a unique id that is generated when the event is created. It also carries the host_id and user_id in the EventId, which is included in every event.

Reference and Shared Tables

In an application, some data is shared by all tenants, for example, the dropdown options on the UI and business validation rules. We call this reference data and have defined several tables to manage it centrally. For each reference data type, there is a logical table defined in ref_table_t and marked as common or not. Common means the table can be shared with other tenants; otherwise, it is private to the owner tenant.

Some other entities are very similar but cannot fit into the reference tables. For example, the category_t table contains all the category definitions for different entities. These tables are designed with an optional host_id. Here is an example.

CREATE TABLE category_t (
    category_id          VARCHAR(22) NOT NULL,   -- unique id to identify the category
    host_id              VARCHAR(22),            -- null means a global category
    entity_type          VARCHAR(50) NOT NULL,   -- the entity type the category applies to
    category_name        VARCHAR(126) NOT NULL,  -- category name, must be url friendly.
    category_desc        VARCHAR(1024) NOT NULL, -- description
    parent_category_id   VARCHAR(22) REFERENCES category_t(category_id) ON DELETE SET NULL, -- parent category id, null if there is no parent.
    sort_order           INT DEFAULT 0,          -- sort order on the UI
    update_user          VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_ts            TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (category_id)
);

-- 1. Unique index for GLOBAL categories (where host_id IS NULL)
-- Ensures uniqueness of (entity_type, category_name, parent_category_id) ONLY when host_id is NULL
CREATE UNIQUE INDEX idx_category_unique_global
ON category_t (entity_type, category_name, parent_category_id)
NULLS NOT DISTINCT -- Handles NULLs in parent_category_id correctly
WHERE host_id IS NULL;

-- 2. Unique index for TENANT-SPECIFIC categories (where host_id IS NOT NULL)
-- Ensures uniqueness of (host_id, entity_type, category_name, parent_category_id)
-- for rows that belong to a specific host.
CREATE UNIQUE INDEX idx_category_unique_tenant
ON category_t (host_id, entity_type, category_name, parent_category_id)
NULLS NOT DISTINCT -- Handles NULLs in parent_category_id correctly
WHERE host_id IS NOT NULL;


CREATE INDEX idx_category_entity_type ON category_t (entity_type);
CREATE INDEX idx_category_parent ON category_t (parent_category_id);
CREATE INDEX idx_category_name ON category_t (category_name);
CREATE INDEX idx_category_host_id ON category_t (host_id);

On the UI, the host_id is auto-populated in read-only mode from the host_id associated with the user. There is an “Is Global Category” checkbox in the form. If it is checked, the backend service applies an FGA rule to ensure that the user is an admin, and the host_id is removed from the event. This works for both create and update.

When viewing categories, the super admin might see all categories by default, possibly with a column or indicator showing the host_id (or “Global”). Filters should allow viewing global only, or a specific tenant’s categories.

Tenant Admin / Host Owner:

When a tenant admin accesses the category management UI, their context is fixed to their own host_id.

They should only be able to create/edit categories associated with their specific host_id.

The UI should not offer them the option to create/edit global categories or categories for other hosts. The host_id is implicitly set or displayed as read-only based on their logged-in context.

When viewing categories, they should see their own tenant-specific categories plus all applicable global categories. The UI should clearly differentiate between these (e.g., using grouping, labels, icons). A query sketch for this combined view is shown below.
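
Here is a minimal query sketch against the category_t table above that returns a tenant’s own categories together with the global ones; the entity type 'product' and the :host_id placeholder are illustrative only.

-- Tenant-specific categories plus applicable global categories (host_id IS NULL)
SELECT category_id, host_id, entity_type, category_name, sort_order
FROM category_t
WHERE entity_type = 'product'
  AND (host_id = :host_id OR host_id IS NULL)
ORDER BY sort_order;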

Portal Event

Light Portal is using event sourcing and CQRS. Any update to the system will generate an event and there are hundreds of event types.

All events are in Avro format and are pushed to a Kafka cluster for stream processing. Each event has an EventId that contains common information for events, and it resides in the light-kafka repo.

Here is one of the events in the light-portal.

{
  "type": "record",
  "name": "ApiRuleCreatedEvent",
  "namespace": "net.lightapi.portal.market",
  "fields": [
    {
      "name": "EventId",
      "type": {
        "type": "record",
        "name": "EventId",
        "namespace": "com.networknt.kafka.common",
        "fields": [
          {
            "name": "id",
            "type": "string",
            "doc": "a unique identifier"
          },
          {
            "name": "nonce",
            "type": "long",
            "doc": "the number of the transactions for the user"
          },
          {
            "name": "timestamp",
            "type": "long",
            "default": 0,
            "doc": "time the event is recorded"
          },
          {
            "name": "derived",
            "type": "boolean",
            "default": false,
            "doc": "indicate if the event is derived from event processor"
          }
        ]
      }
    },
    {
      "name": "hostId",
      "type": "string",
      "doc": "host id"
    },
    {
      "name": "apiId",
      "type": "string",
      "doc": "api id"
    },
    {
      "name": "ruleIds",
      "type": {
        "type": "array",
        "items": "string"
      },
      "doc": "one or many rule ids that link to the apiId"
    }
  ]
}

Kafka Key

When pushing events into a Kafka topic, the record key is used to distribute records across Kafka partitions. Here is the key selection for the system.

  • multi-tenant

The key will be the hostId

  • single-tenant

The key will be the userId

Promotion or Replay

Promotion approaches

  1. When promoting from dev to sit, we can export all events from dev, update the event JSON file, and then replay it to sit.
  2. We can import the original event JSON from dev to sit and then update some events on the sit host.

Promotable Event Type

There are two types of events: configurable events and transactional events. We should only promote the configurable events from dev to sit, not transactional records such as deployment logs. We need a table to define the promotable event types; a sketch is shown below.
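
A minimal sketch of such a table, using a hypothetical name and layout (the actual design may differ):

-- Hypothetical list of event types that are eligible for promotion between environments
CREATE TABLE promotable_event_t (
    event_type    VARCHAR(126) NOT NULL,  -- e.g. 'ApiRuleCreatedEvent'
    promotable    CHAR(1) NOT NULL DEFAULT 'Y',
    update_user   VARCHAR(255) DEFAULT SESSION_USER NOT NULL,
    update_ts     TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (event_type)
);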

Reference Table

When building a web application, there are usually many dropdown selects in forms. The form itself only needs the id and label list to render, and only the id is submitted to the backend API for a single select (or several ids for a multiple select).

To avoid creating many similar tables, we create a single set of tables for all dropdowns. For some reference tables, the dropdown should be the same across all hosts, and we can set the common flag to ‘Y’ so that they are shared by all hosts. If the dropdown values might differ between hosts, we can create a reference table per host and link it to the host in a separate table that supports sharding.

Reference Schema

CREATE TABLE ref_table_t (
  table_id             VARCHAR(22) NOT NULL, -- UUID generated by Util
  table_name           VARCHAR(80) NOT NULL, -- Name of the ref table for lookup.
  table_desc           VARCHAR(1024) NULL,
  active               CHAR(1) NOT NULL DEFAULT 'Y', -- Only active table returns values
  editable             CHAR(1) NOT NULL DEFAULT 'Y', -- Table value and locale can be updated via ref admin
  common               CHAR(1) NOT NULL DEFAULT 'Y', -- The dropdown is shared across hosts
  PRIMARY KEY(table_id)
);

CREATE TABLE ref_host_t (
  table_id             VARCHAR(22) NOT NULL,
  host_id              VARCHAR(22) NOT NULL,
  PRIMARY KEY (table_id, host_id),
  FOREIGN KEY (table_id) REFERENCES ref_table_t (table_id) ON DELETE CASCADE,
  FOREIGN KEY (host_id) REFERENCES host (host_id) ON DELETE CASCADE
);


CREATE TABLE ref_value_t (
  value_id              VARCHAR(22) NOT NULL,
  table_id              VARCHAR(22) NOT NULL,
  value_code            VARCHAR(80) NOT NULL, -- The dropdown value
  start_time            TIMESTAMP NULL,
  end_time              TIMESTAMP NULL,
  display_order         INT,                  -- for editor and dropdown list.
  active                VARCHAR(1) NOT NULL DEFAULT 'Y',
  PRIMARY KEY(value_id),
  FOREIGN KEY (table_id) REFERENCES ref_table_t (table_id) ON DELETE CASCADE
);


CREATE TABLE value_locale_t (
  value_id              VARCHAR(22) NOT NULL,
  language              VARCHAR(2) NOT NULL,
  value_desc            VARCHAR(256) NULL, -- The dropdown label in this language.
  PRIMARY KEY(value_id, language),
  FOREIGN KEY (value_id) REFERENCES ref_value_t (value_id) ON DELETE CASCADE
);



CREATE TABLE relation_type_t (
  relation_id           VARCHAR(22) NOT NULL,
  relation_name         VARCHAR(32) NOT NULL, -- The lookup keyword for the relation.
  relation_desc         VARCHAR(1024) NOT NULL,
  PRIMARY KEY(relation_id)
);



CREATE TABLE relation_t (
  relation_id           VARCHAR(22) NOT NULL,
  value_id_from         VARCHAR(22) NOT NULL,
  value_id_to           VARCHAR(22) NOT NULL,
  active                VARCHAR(1) NOT NULL DEFAULT 'Y',
  PRIMARY KEY(relation_id, value_id_from, value_id_to),
  FOREIGN KEY (relation_id) REFERENCES relation_type_t (relation_id) ON DELETE CASCADE,
  FOREIGN KEY (value_id_from) REFERENCES ref_value_t (value_id) ON DELETE CASCADE,
  FOREIGN KEY (value_id_to) REFERENCES ref_value_t (value_id) ON DELETE CASCADE
);

Authentication & Authorization

Light-Portal is a single-page application (SPA) that utilizes both the OAuth 2.0 Authorization Code and Client Credentials flows.

The following pattern illustrates the end-to-end process recommended by the Light Platform for an SPA interacting with downstream APIs.

Sequence Diagram

sequenceDiagram
    participant PortalView as Portal View
    participant LoginView as Login View
    participant Gateway as Light Gateway
    participant OAuthKafka as OAuth-Kafka
    participant AuthService as Auth Service
    participant ProxySidecar as Proxy Sidecar
    participant BackendAPI as Backend API

    PortalView ->> LoginView: 1. Signin redirect
    LoginView ->> OAuthKafka: 2. Authenticate user
    OAuthKafka ->> AuthService: 3. Authenticate User<br/>(Active Directory<br/>for Employees)<br/>(CIF System<br/>for Customers)
    AuthService ->> OAuthKafka: 4. Authenticated
    OAuthKafka ->> OAuthKafka: 5. Generate auth code
    OAuthKafka ->> PortalView: 6. Redirect with code
    PortalView ->> Gateway: 7. Authorization URL<br/>with code param
    Gateway ->> OAuthKafka: 8. Create JWT access<br/>token with code
    OAuthKafka ->> OAuthKafka: 9. Generate JWT<br/>access token<br/>with user claims
    OAuthKafka ->> Gateway: 10. Token returns<br/>to Gateway
    Gateway ->> PortalView: 11. Token returns<br/>to Portal View<br/>in Secure Cookie
    PortalView ->> Gateway: 12. Call Backend API
    Gateway ->> Gateway: 13. Verify the token
    Gateway ->> OAuthKafka: 14. Create Client<br/>Credentials token
    OAuthKafka ->> OAuthKafka: 15. Generate Token<br/>with Scopes
    OAuthKafka ->> Gateway: 16. Return the<br/>scope token
    Gateway ->> Gateway: 17. Add scope<br/>token to<br/>X-Scope-Token<br/>Header
    Gateway ->> ProxySidecar: 18. Invoke API
    ProxySidecar ->> ProxySidecar: 19. Verify<br/>Authorization<br/>token
    ProxySidecar ->> ProxySidecar: 20. Verify<br/>X-Scope-Token
    ProxySidecar ->> ProxySidecar: 21. Fine-Grained<br/>Authorization
    ProxySidecar ->> BackendAPI: 22. Invoke<br/>business API
    BackendAPI ->> ProxySidecar: 23. Business API<br/>response
    ProxySidecar ->> ProxySidecar: 24. Fine-Grained<br/>response filter
    ProxySidecar ->> Gateway: 25. Return response
    Gateway ->> PortalView: 26. Return response

  1. When a user visits the website to access the single-page application (SPA), the Light Gateway serves the SPA to the user’s browser. Each single-page application has a dedicated Light Gateway instance that acts as a BFF (backend for frontend). By default, the user is not logged in and can only access limited site features. To unlock additional features, the user can click the User button in the header and select the Sign In menu. This action redirects the browser from the Portal View to the Login View, both served by the same Light Gateway instance.

  2. On the Login View page, the user can either input a username and password or choose Google/Facebook for authentication. When the login form is submitted, the request is sent to the Light Gateway with the user’s credentials. The Gateway forwards this request to the OAuth Kafka service.

  3. OAuth Kafka supports multiple authenticator implementations to verify user credentials. Examples include authenticating via the Light Portal user database, Active Directory for employees, or CIF service for customers.

  4. Once authentication is successfully completed, the OAuth Kafka responds with the authentication result.

  5. Upon successful authentication, OAuth Kafka generates an authorization code (a UUID associated with the user’s profile).

  6. OAuth Kafka redirects the authorization code back to the browser at the Portal View via the Gateway.

  7. Since the Portal View SPA lacks a dedicated redirect route for the authorization code, the browser sends the code as a query parameter in a request to the Gateway.

  8. The StatelessAuthHandler in the Gateway processes this request, initiating a token request to OAuth Kafka to obtain a JWT access token.

  9. OAuth Kafka generates an access token containing user claims in its custom JWT claims. The authorization code is then invalidated, as it is single-use.

  10. The access token is returned to the Gateway.

  11. The StatelessAuthHandler in the Gateway stores the access token in a secure cookie and sends it back to the Portal View.

  12. When the Portal View SPA makes requests to backend APIs, it includes the secure cookie in the API request sent to the Gateway.

  13. The StatelessAuthHandler in the Gateway validates the token in the secure cookie and places it in the Authorization header of the outgoing request.

  14. If the token is successfully validated, the TokenHandler in the Gateway makes a request to OAuth Kafka for a client credentials token, using the path prefix of the API endpoint.

  15. OAuth Kafka generates a client credentials token with the appropriate scope for accessing the downstream service.

  16. The client credentials token is returned to the Gateway.

  17. The TokenHandler in the Gateway inserts this token into the X-Scope-Token header of the original request.

  18. The Gateway routes the original request, now containing both tokens, to the downstream proxy sidecar of the backend API.

  19. The proxy sidecar validates the Authorization token, verifying its signature, expiration, and other attributes.

  20. The proxy sidecar also validates the X-Scope-Token, ensuring its signature, expiration, and scope are correct.

  21. Once both tokens are successfully validated, the proxy sidecar enforces fine-grained authorization rules based on the user’s custom security profile contained in the Authorization token.

  22. If the fine-grained authorization checks are passed, the proxy sidecar forwards the request to the backend API.

  23. The backend API processes the request and sends the full response back to the proxy sidecar.

  24. The proxy sidecar applies fine-grained filters to the response, reducing the number of rows and/or columns based on the user’s security profile or other policies.

  25. The proxy sidecar returns the filtered response to the Gateway.

  26. The Gateway forwards the response to the Portal View, allowing the SPA to render the page.

Fine-Grained Authorization

What is Fine-Grained Authorization?

Fine-grained authorization (FGA) refers to a detailed and precise control mechanism that governs access to resources based on specific attributes, roles, or rules. It’s also known as fine-grained access control (FGAC). Unlike coarse-grained authorization, which applies broader access policies (e.g., “Admins can access everything”), fine-grained authorization allows for more specific policies (e.g., “Admins can access user data only if they belong to the same department and the access request is during business hours”).

Key Features

  • Granular Control: Policies are defined at a detailed level, considering attributes like user role, resource type, action, time, location, etc.
  • Context-Aware: Takes into account dynamic conditions such as the time of request, user’s location, or other contextual factors.
  • Flexible Policies: Allows the creation of complex, conditional rules tailored to the organization’s needs.

Why Do We Need Fine-Grained Authorization?

1. Enhanced Security

By limiting access based on detailed criteria, fine-grained authorization minimizes the risk of unauthorized access or data breaches.

2. Regulatory Compliance

It helps organizations comply with legal and industry-specific regulations (e.g., GDPR, HIPAA) by ensuring sensitive data is only accessible under strict conditions.

3. Minimized Attack Surface

By restricting access to only the required resources and operations, fine-grained authorization reduces the potential impact of insider threats or compromised accounts.

4. Improved User Experience

Enables personalized access based on roles and permissions, ensuring users see only what they need, which reduces confusion and improves productivity.

5. Auditing and Accountability

Detailed access logs and policy enforcement make it easier to track and audit who accessed what, when, and why, fostering better accountability.

Examples of Use Cases

  • Healthcare: A doctor can only view records of patients they are treating.
  • Government: A government employee can access data and documents based on security clearance levels and job roles.
  • Finance: A teller can only access transactions related to their assigned branch.
  • Enterprise Software: Employees can edit documents only if they own them or have been granted editing permissions.

Fine-Grained Authorization in API Access Control

In API access control, fine-grained authorization governs how users or systems interact with specific API endpoints, actions, and data. This approach ensures that access permissions are precisely tailored to attributes, roles, and contextual factors, enabling a secure and customized API experience. As the Light Portal is a platform centered on APIs, the remainder of the design will focus on the API access control context.

Early Approaches to Fine Grained Authorization

Early approaches to fine grained authorization primarily involved Access Control Lists (ACLs) and Role-Based Access Control (RBAC). These methods laid the foundation for more sophisticated access control mechanisms that followed. Here’s an overview of these primary approaches:

Access Control Lists (ACLs):

  • ACLs were one of the earliest forms of fine grained authorization, allowing administrators to specify access permissions on individual resources for each user or group of users.

  • In ACLs, permissions are directly assigned to users or groups, granting or denying access to specific resources based on their identities.

  • While effective for small-scale environments with limited resources and users, ACLs became cumbersome as organizations grew. Maintenance issues arose, such as the time required to manage access to an increasing number of resources for numerous users.

Role-Based Access Control (RBAC):

  • RBAC emerged as a solution to the scalability and maintenance challenges posed by ACLs. It introduced the concept of roles, which represent sets of permissions associated with particular job functions or responsibilities.

  • Users are assigned one or more roles, and their access permissions are determined by the roles they possess rather than their individual identities.

  • RBAC can be implemented with varying degrees of granularity. Roles can be coarse-grained, providing broad access privileges, or fine-grained, offering more specific and nuanced permissions based on organizational needs.

  • Initially, RBAC appeared to address the limitations of ACLs by providing a more scalable and manageable approach to access control.

Both ACLs and RBAC have their shortcomings:

  • Maintenance Challenges: While RBAC offered improved scalability compared to ACLs, it still faced challenges with role management as organizations expanded. The proliferation of roles, especially fine grained ones, led to a phenomenon known as role explosion where the number of roles grew rapidly, making them difficult to manage effectively.

  • Security Risks: RBAC’s flexibility also posed security risks. Over time, users might accumulate permissions beyond what they need for their current roles, leading to a phenomenon known as permission creep. This weakened overall security controls and increased the risk of unauthorized access or privilege misuse.

Following the discussion of early approaches to fine grained authorization, it’s crucial to acknowledge that different applications have varying needs for authorization.

Whether to use fine grained or coarse-grained controls depends on the specific project. Controlling access becomes trickier due to the spread-out nature of resources and differing levels of detail needed across components. Let’s delve into the differentiating factors:

Standard Models for Implementing FGA

There are several standard models for implementing FGA:

  • Attribute-Based Access Control (ABAC): In ABAC, access control decisions are made by evaluating attributes such as user roles, resource attributes (e.g., type, size, status), requested action, current date and time, and any other relevant contextual information. ABAC allows for very granular control over access based on a wide range of attributes.

  • Policy-Based Access Control (PBAC): PBAC is similar to ABAC but focuses more on defining policies than directly evaluating attributes. Policies in PBAC typically consist of rules or logic that dictate access control decisions based on various contextual factors. While ABAC relies heavily on data (attributes), PBAC emphasizes using logic to determine access.

  • Relationship-Based Access Control (ReBAC): ReBAC emphasizes the relationships between users and resources, as well as relationships between different resources. By considering these relationships, ReBAC provides a powerful and expressive model for describing complex authorization contexts. This can involve the attributes of users and resources and their interactions and dependencies.

Each of these models offers different strengths and may be more suitable for different scenarios. FGA allows for fine grained control over access, enabling organizations to enforce highly specific access policies tailored to their requirements.

Streamlining FGA by Implementing Rule-Based Access Control:

ABAC (Attribute-Based Access Control) focuses on data attributes, PBAC (Policy-Based Access Control) centers on logic, and ReBAC (Relationship-Based Access Control) emphasizes relationships between users and resources. But what if we combined all three to leverage the strengths of each? This is the idea behind Rule-Based Access Control (RuBAC).

By embedding a lightweight rule engine, we can integrate multiple rules and actions to achieve the following:

  • Optimize ABAC: Reduce the number of required attributes since not all rules depend on them. For example, a standard rule like “Customer data can only be accessed during working hours” can be shared across policies.

  • Flexible Policy Enforcement: Using a rule engine makes access policies more dynamic and simpler to manage.

  • Infer Relationships: Automatically deduce relationships between entities. For instance, the rule engine could grant a user access to a file if they already have permission for the containing folder.

Principle of Least Privilege

The principle of least privilege, widely referred to as least privilege or PoLP, is the security concept in which users are granted the minimum level of access and permissions to an app, data, or system required to perform their job functions.

To ensure PoLP is effectively enforced, we’ve compiled a list of best practices:

  • Conduct a thorough privilege audit: As we know, visibility is critical in an access environment, so conducting regular or periodic access audits of all privileged accounts can help your team gain complete visibility. This audit includes reviewing privileged accounts and credentials held by employees, contractors, and third-party vendors, whether on-premises, accessible remotely, or in the cloud. However, your team must also focus on default and hard-coded credentials, which IT teams often overlook.

  • Establish the least privilege as the default: Start by granting new accounts the minimum privileges required for their tasks and eliminate or reconfigure default permissions on new systems or applications. Further, use role-based access control to help your team determine the necessary privileges for a new account by providing general guidelines based on roles and responsibilities. Also, your team needs to update and adjust access level permissions when the user’s role changes; this will help prevent privilege creep.

  • Enforce separation of privileges: Your team can prevent over-provisioning by limiting administrator privileges. Firstly, segregate administrative accounts from standard accounts, even if they belong to the same user, and isolate privileged user sessions. Then, grant administrative privileges (such as read, write, and execute permissions) only to the extent necessary for the user to perform their specific administrative tasks. This will help your team prevent granting users unnecessary or excessive control over critical systems, which could lead to security vulnerabilities or misconfigurations.

  • Provide just-in-time, limited access: To maintain least-privilege access without hindering employee workflows, combine role-based access control with time-limited privileges. Further, replace hard-coded credentials with dynamic secrets or use one-time-use/temporary credentials. This will help your team grant temporary elevated access permissions when users need it, for instance, to complete specific tasks or short-term projects.

  • Keep track and evaluate privileged access: Continuously monitor authentications and authorizations across your API platform and ensure all the individual actions are traceable. Additionally, record all authentication and authorization sessions comprehensively, and use automated tools to swiftly identify any unusual activity or potential issues. These best practices are designed to enhance the security of your privileged accounts, data, and assets while ensuring compliance adherence and improving operational security without disrupting user workflows.

OpenAPI Specification Extensions

OpenAPI uses the term security scheme for authentication and authorization schemes, and OpenAPI 3.0 lets you describe APIs protected by such schemes. Fine-grained authorization is just another layer of security, so it is natural to define it in the same specification. This can be done with OpenAPI specification extensions.

Extensions (also referred to as specification extensions or vendor extensions) are custom properties that start with x-, such as x-logo. They can be used to describe extra functionality that is not covered by the standard OpenAPI Specification. Many API-related products that support OpenAPI make use of extensions to document their own attributes, such as Amazon API Gateway, ReDoc, APIMatic and others.

As the OpenAPI specification openapi.yaml is loaded during light-4j startup, the extensions are available at runtime in the cache for each endpoint, just like the scope definitions. The API owner can define the following two extensions for each endpoint:

  • x-request-access: This section allows the designer to specify one or more rules as well as one or more security attributes (for example, roles, location, etc.) as inputs to the rules. The rule result decides whether the user has access to the endpoint based on the security attributes from the JWT token in the request chain.

  • x-response-filter: This section is similar to the above; however, it works on the response chain. The rule result decides which rows or columns of the response JSON are returned to the user based on the security profile from the JWT token.

Example of OpenAPI specification with fine-grained authorization.

paths:
  /accounts:
    get:
      summary: "List all accounts"
      operationId: "listAccounts"
      x-request-access:
        rule: "account-cc-group-role-auth"
        roles: "manager teller customer"
      x-response-filter:
        - rule: "account-row-filter"
          teller:
            status: open
          customer:
            status: open
            owner: "@user_id"
        - rule: "account-col-filter"
          teller: ["num","owner","type","firstName","lastName","status"]
          customer: ["num","owner","type","firstName","lastName"]
      security:
      - account_auth:
        - "account.r"

FGA Rules for AccessControlHandler

With the above specification loaded at runtime, the rules are loaded during server startup for the service as well. In the Rule Registry on the light-portal, we have a set of built-in rules that can be picked as fine-grained policies for each API. Here is an example rule for the x-request-access section of the above specification.

account-cc-group-role-auth:
  ruleId: account-cc-group-role-auth
  host: lightapi.net
  description: Role-based authorization rule for account service and allow cc token and transform group to role.
  conditions:
    - conditionId: allow-cc
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.user_id
      operatorCode: NIL
      joinCode: OR
      index: 1
    - conditionId: manager
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.groups
      operatorCode: CS
      joinCode: OR
      index: 2
      conditionValues:
        - conditionValueId: manager
          conditionValue: admin
    - conditionId: teller
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.groups
      operatorCode: CS
      joinCode: OR
      index: 3
      conditionValues:
        - conditionValueId: teller
          conditionValue: frontOffice
    - conditionId: allow-role-jwt
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.roles
      operatorCode: NNIL
      joinCode: OR
      index: 4
  actions:
    - actionId: match-role
      actionClassName: com.networknt.rule.FineGrainedAuthAction
      actionValues:
        - actionValueId: roles
          value: $roles

All rules are managed by the light-portal and shared by all the services. In addition, developers can create their customized rules for their own services.

Response Filter

There are two types of filters: row and column.

Row

For a row filter, we need to check the conditions defined on certain properties in order to make the filter decision. In the database, for each endpoint, we have colName, operator, and colValue defined for the condition.

The operator supports the following enum: ["=", "!=", "<", ">", "<=", ">=", "in", "not in", "range"]

For the colValue, we support variables from the JWT token prefixed with @. For example, @eid will be replaced with the eid claim from the JWT token. A hypothetical storage sketch follows.
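
To illustrate how the colName/operator/colValue triplet could be stored, here is a hypothetical sketch; the table and column names are illustrative, not the actual light-portal schema.

-- Hypothetical storage of row-filter conditions per endpoint and role
CREATE TABLE response_row_filter_t (
    endpoint   VARCHAR(255) NOT NULL,  -- e.g. '/accounts@get'
    role       VARCHAR(64)  NOT NULL,  -- e.g. 'customer'
    col_name   VARCHAR(64)  NOT NULL,  -- e.g. 'owner'
    operator   VARCHAR(8)   NOT NULL,  -- one of =, !=, <, >, <=, >=, in, not in, range
    col_value  VARCHAR(255) NOT NULL,  -- literal value or a JWT variable such as '@user_id'
    PRIMARY KEY (endpoint, role, col_name)
);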

Col

For a column filter, we either include a list of columns or exclude a list of columns in JSON format.

["accountNo","firstName","lastName"]

or

!["status"]

JSON Schema Registry

JSON Schema is a declarative language that provides a standardized way to describe and validate JSON data.

What it does

JSON Schema defines the structure, content, data types, and constraints of JSON documents. It’s an IETF standard that helps ensure the consistency and integrity of JSON data across applications.

How it works

JSON Schema uses keywords to define data properties. A JSON Schema validator checks if JSON documents conform to the schema.

What it’s useful for

  • Describing existing data formats
  • Validating data as part of automated testing
  • Submitting client data
  • Defining how a record should be organized

What is a JSON Schema Registry

The JSON Schema Registry provides a centralized service for your JSON schemas with RESTful endpoints for storing and retrieving JSON schemas.

When using data in a distributed application with many RESTful APIs, it is important to ensure that it is well-formed and structured. If data is sent without prior validation, errors may occur on the services. A schema registry provides a way to ensure that the data is validated before it is sent and validated after it is received.

A schema registry is a service used to define and confirm the structure of data that is sent between consumers and providers. In a schema registry, developers can define what the data should look like and how it should be validated. The schemas can also be referenced in OpenAPI specifications so that they are externalized from the specifications themselves.

Schema records can also help ensure forward and backward compatibility when changes are made to the data structure. When a schema record is used, the data is transferred with additional schema information that helps applications reading the data interpret it correctly.

Given that API consumers and providers can belong to different groups or organizations, it is necessary to have a centralized service to manage the schemas so that they can be shared between them. This is why we have implemented this service as part of the light-portal.

Schema Specification Version

The registry is a heterogeneous registry, as it can store schemas of different draft versions. By default, the registry is configured to store schemas of Draft 2020-12. When a schema is added, it is saved under the specification version that is currently configured.

The following list contains all supported specification versions.

  • Draft 4
  • Draft 6
  • Draft 7
  • 2019-09
  • 2020-12

Schema Version

Once a schema is registered in the registry, it is assigned version 1. Each time it is updated, the version number increases by 1. When the schema is retrieved, the version number can be part of the URL to indicate that the exact version should be retrieved. If the version number is not in the URL, the latest version is retrieved.

Access Endpoint

Table Structure

Light Controller

YAML Rule Registry

React Schema Form

React Schema Form is a form generator based on JSON Schema and form definitions from Light Portal. It renders UI forms to manipulate database entities, and form submissions are automatically hooked into an API endpoint.

Debugging a Component

Encountering a bug in a react-schema-form component can be challenging since the source code may not be directly visible. To debug:

  1. Set up the Light Portal server if dropdowns are loaded from the server.
  2. Use the example app in the same project to debug.

Use a Local Alias with Vite

Vite allows creating an alias to point to your library’s src folder. Update the vite.config.ts in your example app:

import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import path from 'path';

export default defineConfig({
  plugins: [react()],
  resolve: {
    alias: {
      'react-schema-form': path.resolve(__dirname, '../src'), // Adjust the path to point to the library's `src` folder
    },
  },
});

Update the example app’s package.json file. In the dependencies section, replace the library’s version with a local path:

{
  "dependencies": {
    "react-schema-form": "file:../src"
  }
}

Library Entry Point

Vite requires an entry point file, typically named index.js or index.ts, in your library’s src folder. Ensure that your library’s src folder includes a properly configured index.js file, like this:

export { default as SchemaForm } from './SchemaForm'
export { default as ComposedComponent } from './ComposedComponent'
export { default as utils } from './utils'
export { default as Array } from './Array'

Without a correctly named and configured entry file, components like SchemaForm may not be imported properly.

Update index.html

If you change the entry point file from main.js to index.js, ensure you update the reference in the index.html file located in the root folder. For example:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <link rel="icon" type="image/svg+xml" href="/vite.svg" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Vite + React</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/src/index.js"></script>
  </body>
</html>

Sync devDependencies from peerDependencies

When the source code in src is used directly by the example app, the peerDependencies in the example app won’t work for react-schema-form components. To address this, copy the peerDependencies into the devDependencies section of react-schema-form’s package.json. For example:

  "devDependencies": {
    "@babel/runtime": "^7.26.0",
    "@codemirror/autocomplete": "^6.18.2",
    "@codemirror/language": "^6.10.6",
    "@codemirror/lint": "^6.8.2",
    "@codemirror/search": "^6.5.7",
    "@codemirror/state": "^6.4.1",
    "@codemirror/theme-one-dark": "^6.1.2",
    "@codemirror/view": "^6.34.2",
    "@emotion/react": "^11.13.5",
    "@emotion/styled": "^11.13.5",
    "@eslint/js": "^9.13.0",
    "@lezer/common": "^1.2.3",
    "@mui/icons-material": "^6.1.6",
    "@mui/material": "^6.1.6",
    "@mui/styles": "^6.1.6",
    "@types/react": "^18.3.1",
    "@uiw/react-markdown-editor": "^6.1.2",
    "@vitejs/plugin-react": "^4.3.3",
    "codemirror": "^6.0.1",
    "eslint": "^9.13.0",
    "eslint-plugin-react": "^7.37.2",
    "eslint-plugin-react-hooks": "^5.0.0",
    "eslint-plugin-react-refresh": "^0.4.14",
    "gh-pages": "^6.2.0",
    "globals": "^15.11.0",
    "react": "^18.3.1",
    "react-dom": "^18.3.1",
    "vite": "^6.0.3"
  },
  "peerDependencies": {
    "@babel/runtime": "^7.26.0",
    "@codemirror/autocomplete": "^6.18.2",
    "@codemirror/language": "^6.10.6",
    "@codemirror/lint": "^6.8.2",
    "@codemirror/search": "^6.5.7",
    "@codemirror/state": "^6.4.1",
    "@codemirror/theme-one-dark": "^6.1.2",
    "@codemirror/view": "^6.34.2",
    "@emotion/react": "^11.13.5",
    "@emotion/styled": "^11.13.5",
    "@lezer/common": "^1.2.3",
    "@mui/icons-material": "^6.1.6",
    "@mui/material": "^6.1.6",
    "@mui/styles": "^6.1.6",
    "@types/react": "^18.3.1",
    "@uiw/react-markdown-editor": "^6.1.2",
    "codemirror": "^6.0.1",
    "react": "^18.3.1",
    "react-dom": "^18.3.1"
  },

Additionally, ensure the peerDependencies are also synced with the dependencies section of the example app’s package.json. This step allows react-schema-form components to load independently and work seamlessly during development.

Update Source Code

After completing all the updates, perform a clean install for both react-schema-form and the example app. Then, start the server from the example folder using the following command:

yarn dev

Whenever you modify a react-schema-form component, simply refresh the browser to reload the example application and see the updated component in action.

Debug with Visual Studio Code

You can debug the component using Visual Studio Code. There are many tutorials available online that explain how to debug React applications built with Vite, which can help you set up breakpoints, inspect components, and track down issues effectively.

Component dynaselect

dynaselect is a component that renders a dropdown select, either from static options or from options loaded dynamically from a server via an API endpoint. It is a wrapper around the Material UI Autocomplete component. Below is an example form from the example app that demonstrates how to use this component.

{
  "schema": {
    "type": "object",
    "title": "React Component Autocomplete Demo Static Single",
    "properties": {
      "name": {
        "title": "Name",
        "type": "string",
        "default": "Steve"
      },
      "host": {
        "title": "Host",
        "type": "string"
      },
      "environment": {
        "type": "string",
        "title": "Environment",
        "default": "LOCAL",
        "enum": [
          "LOCAL",
          "SIT1",
          "SIT2",
          "SIT3",
          "UAT1",
          "UAT2"
        ]
      },
      "stringarraysingle": {
        "type": "array",
        "title": "Single String Array",
        "items": {
          "type": "string"
        }
      },
      "stringcat": {
        "type": "string",
        "title": "Joined Strings"
      },
      "stringarraymultiple": {
        "type": "array",
        "title": "Multiple String Array",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "name",
      "environment"
    ]
  },
  "form": [
    "name",
    {
      "key": "host",
      "type": "dynaselect",
      "multiple": false,
      "action": {
        "url": "https://localhost/portal/query?cmd=%7B%22host%22%3A%22lightapi.net%22%2C%22service%22%3A%22user%22%2C%22action%22%3A%22listHost%22%2C%22version%22%3A%220.1.0%22%7D"
      }
    },
    {
      "key": "environment",
      "type": "dynaselect",
      "multiple": false,
      "options": [
        {
          "id": "LOCAL",
          "label": "Local"
        },
        {
          "id": "SIT1",
          "label": "SIT1"
        },
        {
          "id": "SIT2",
          "label": "SIT2"
        },
        {
          "id": "SIT3",
          "label": "SIT3"
        },
        {
          "id": "UAT1",
          "label": "UAT1"
        },
        {
          "id": "UAT2",
          "label": "UAT2"
        }
      ]
    },
    {
      "key": "stringarraysingle",
      "type": "dynaselect",
      "multiple": false,
      "options": [
        {
          "id": "id1",
          "label": "label1"
        },
        {
          "id": "id2",
          "label": "label2"
        },
        {
          "id": "id3",
          "label": "label3"
        },
        {
          "id": "id4",
          "label": "label4"
        },
        {
          "id": "id5",
          "label": "label5"
        },
        {
          "id": "id6",
          "label": "label6"
        }
      ]
    },
    {
      "key": "stringcat",
      "type": "dynaselect",
      "multiple": true,
      "options": [
        {
          "id": "id1",
          "label": "label1"
        },
        {
          "id": "id2",
          "label": "label2"
        },
        {
          "id": "id3",
          "label": "label3"
        },
        {
          "id": "id4",
          "label": "label4"
        },
        {
          "id": "id5",
          "label": "label5"
        },
        {
          "id": "id6",
          "label": "label6"
        }
      ]
    },
    {
      "key": "stringarraymultiple",
      "type": "dynaselect",
      "multiple": true,
      "options": [
        {
          "id": "id1",
          "label": "label1"
        },
        {
          "id": "id2",
          "label": "label2"
        },
        {
          "id": "id3",
          "label": "label3"
        },
        {
          "id": "id4",
          "label": "label4"
        },
        {
          "id": "id5",
          "label": "label5"
        },
        {
          "id": "id6",
          "label": "label6"
        }
      ]
    }
  ]
}

Dynamic Options from APIs

The host is a string type field rendered as a dynaselect with multiple set to false. The options for the select are loaded via an API endpoint, with the action URL provided. Note that the cmd query parameter value is encoded because it contains curly brackets {}.

To encode and decode the query parameter value, you can use the following tool:

Encoder/Decoder Tool

Encoded:

%7B%22host%22%3A%22lightapi.net%22%2C%22service%22%3A%22user%22%2C%22action%22%3A%22listHost%22%2C%22version%22%3A%220.1.0%22%7D

Decoded:

{"host":"lightapi.net","service":"user","action":"listHost","version":"0.1.0"}

When using the example app to test the react-schema-form with APIs, you need to configure CORS on the light-gateway. Ensure that CORS is enabled only on the light-gateway and not on the backend API, such as hybrid-query.

Here is the example in values.yml for the light-gateway.

# cors.yml
cors.enabled: true
cors.allowedOrigins:
  - https://devsignin.lightapi.net
  - https://dev.lightapi.net
  - https://localhost:3000
  - http://localhost:5173
cors.allowedMethods:
  - GET
  - POST
  - PUT
  - DELETE

Single string type

For the environment field, the schema defines the type as string, and the form definition specifies multiple: false to indicate it is a single select.

The select result in the model looks like the following:

{
  "environment": "SIT1"
}

Single string array type

For the stringarraysingle field, the schema defines the type as a string array, and the form definition specifies multiple: false to indicate it is a single select.

The select result in the model looks like the following:

{
  "stringarraysingle": [
    "id3"
  ]
}

Multiple string type

For the stringcat field, the schema defines the type as a string, and the form definition specifies multiple: true to indicate it is a multiple select.

The select result in the model looks like the following:

{
  "stringcat": "id2,id4"
}

Multiple string array type

For the stringarraymultiple field, the schema defines the type as a string array, and the form definition specifies multiple: true to indicate it is a multiple select.

The select result in the model looks like the following:

{
  "stringarraymultiple": [
    "id2",
    "id5",
    "id3"
  ]
}

User Management

User Type

The user_type field is a critical part of the user security profile in the JWT token and can be leveraged for fine-grained authorization. In a multi-tenant environment, user_type is presented as a dropdown populated from the reference table configured for the organization. It can be dynamically selected based on the host chosen during the user registration process.

Supported Standard Dropdown Models

  1. Employee and Customer

    • Dropdown values: E (Employee), C (Customer)
    • Default model for lightapi.net host.
    • Suitable for most organizations.
  2. Employee, Personal, and Business

    • Dropdown values:
      • E (Employee)
      • P (Personal)
      • B (Business)
    • Commonly used for banks where personal and business banking are separated.

Database Configuration

  • The user_type field is nullable in the user_t table by default.
  • However, you can enforce this field as mandatory in your application via the schema and UI configuration.

On-Prem Deployment

In on-premise environments, the user_type can determine the authentication method:

  • Employees: Authenticated via Active Directory.
  • Customers: Authenticated via a customer database.

This flexibility allows organizations to tailor the authentication process based on their specific needs and user classifications.

Handling Users with Multi-Host Access

There are two primary ways to handle users who belong to multiple hosts:

  1. User-Host Mapping Table:

user_t: This table would not have a host_id and would store core user information that is host-independent. The user_id would be unique across all hosts.

user_host_t (or user_tenant_t): This would be a mapping table to represent the many-to-many relationship between users and hosts.

-- user_t (no host_id, globally unique user_id)
CREATE TABLE user_t (
    user_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), -- UUID is recommended
    -- ... other user attributes (e.g., name, email) 
);

-- user_host_t (mapping table)
CREATE TABLE user_host_t (
    user_id UUID NOT NULL,
    host_id UUID NOT NULL,
    -- ... other relationship-specific attributes (e.g., roles within the host)
    PRIMARY KEY (user_id, host_id),
    FOREIGN KEY (user_id) REFERENCES user_t (user_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id) REFERENCES host_t (host_id) ON DELETE CASCADE -- Assuming you have a host_t table
);
  2. Duplicating User Records (Less Recommended):

user_t: You would keep host_id in this table, and the primary key would be (host_id, user_id).

User Duplication: If a user needs access to multiple hosts, you would duplicate their user record in user_t for each host they belong to, each with a different host_id.

Why User-Host Mapping is Generally Preferred:

  • Data Integrity: Avoids data duplication and the potential for inconsistencies that come with it. If a user’s core information (e.g., name, email) changes, you only need to update it in one place in user_t.

  • Flexibility: Easier to add or remove a user’s access to hosts without affecting their core user data.

  • Querying: While you’ll need joins to get a user’s hosts or a host’s users, these joins are straightforward using the mapping table.

  • Scalability: Better scalability as your user base and the number of hosts they can access grow.

Distributing Tables in a Multi-Host User Scenario:

With the user-host mapping approach:

  • user_t: This table would likely be a reference table in Citus (replicated to all nodes) since it does not have a host_id for distribution.

  • user_host_t: This table would be distributed by host_id.

  • Other tables (e.g., employee_t, api_endpoint_t, etc.): These would be distributed by host_id as before.

When querying, you would typically:

  • Start with the user_host_t table to find the hosts a user has access to.

  • Join with other tables (distributed by host_id) based on the host_id to retrieve tenant-specific data, as in the sketch below.
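
For example, here is a minimal sketch of that pattern, assuming a tenant-specific table such as employee_t (defined later in this document): resolve the user's hosts through user_host_t, then join the distributed table on host_id.

SELECT e.host_id, e.employee_id, e.title
FROM user_host_t uh
JOIN employee_t e
  ON e.host_id = uh.host_id   -- include host_id in the join so the query can be routed per shard
 AND e.user_id = uh.user_id
WHERE uh.user_id = ?;          -- the user_id from user_t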

Choosing the Right user_id Primary Key:

Here’s a comparison of the options for the user_id primary key in user_t:

1. UUID (user_id)

  • Pros:
    • Globally Unique: Avoids collisions across hosts or when scaling beyond the current setup.
    • Security: Difficult to guess or enumerate.
    • Scalability: Well-suited for distributed environments like Citus.
  • Cons:
    • Storage: Slightly larger storage size compared to integers.
    • Readability: Not human-readable, which can be inconvenient for debugging.
  • Recommendation:
    This is generally the best option for a user_id in a multi-tenant, distributed environment.

2. Email (email)

  • Pros:
    • Human-Readable: Easy to identify and manage.
    • Login Identifier: Often used as a natural login credential.
  • Cons:
    • Uniqueness Challenges: Enforcing global uniqueness across all hosts may require complex constraints or application logic.
    • Changeability: If emails change, cascading updates can complicate the database.
    • Security: Using emails as primary keys can expose sensitive user data if not handled securely.
    • Performance: String comparisons are slower than those for integers or UUIDs.
  • Recommendation:
    Not recommended as a primary key, especially in a multi-tenant or distributed setup.

3. User-Chosen Unique ID (e.g., username)

  • Pros:
    • Human-Readable: Intuitive and user-friendly.
  • Cons:
    • Uniqueness Challenges: Enforcing global uniqueness is challenging and may require complex constraints.
    • Changeability: Users may request username changes, causing cascading update issues.
    • Security: Usernames are easier to guess or enumerate compared to UUIDs.
  • Recommendation:
    Not recommended as a primary key in a multi-tenant, distributed environment.

In Conclusion:

  • Use a User-Host Mapping Table:
    This is the best approach to handle users who belong to multiple hosts in a multi-tenant Citus environment.

  • Use UUID for user_id:
    UUIDs are the most suitable option for the user_id primary key in user_t due to their global uniqueness, security, and scalability.

  • Distribute by host_id:
    Distribute tables that need sharding by host_id, and ensure that foreign keys to distributed tables include host_id.

  • Use Reference Tables:
    For tables like user_t that don’t have a host_id, designate them as reference tables in Citus.

This approach provides a flexible and scalable foundation for managing users with multi-host access in your Citus-based multi-tenant application.
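
As a minimal sketch, assuming the Citus extension is installed and the tables above have been created, the distribution could be declared as follows:

-- user_t has no host_id, so replicate it to every node as a reference table
SELECT create_reference_table('user_t');

-- shard the tenant-specific mapping table by host_id
SELECT create_distributed_table('user_host_t', 'host_id');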

User Tables

Using a single user_t table with a user_type discriminator is a good approach for managing both employees and customers in a unified way. Adding optional referral relationships for customers adds a nice dimension as well. Here’s a suggested table schema in PostgreSQL, along with explanations and some considerations:

user_t (User Table): This table will store basic information common to both employees and customers.

CREATE TABLE user_t (
    user_id                   VARCHAR(24) NOT NULL,
    email                     VARCHAR(255) NOT NULL,
    password                  VARCHAR(1024) NOT NULL,
    language                  CHAR(2) NOT NULL,
    first_name                VARCHAR(32) NULL,
    last_name                 VARCHAR(32) NULL,
    user_type                 CHAR(1) NULL, -- E employee C customer or E employee P personal B business
    phone_number              VARCHAR(20) NULL,
    gender                    CHAR(1) NULL,
    birthday                  DATE NULL,
    country                   VARCHAR(3) NULL,
    province                  VARCHAR(32) NULL,
    city                      VARCHAR(32) NULL,
    address                   VARCHAR(128) NULL,
    post_code                 VARCHAR(16) NULL,
    verified                  BOOLEAN NOT NULL DEFAULT false,
    token                     VARCHAR(64) NULL,
    locked                    BOOLEAN NOT NULL DEFAULT false,
    nonce                     BIGINT NOT NULL DEFAULT 0,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
);

ALTER TABLE user_t ADD CONSTRAINT user_pk PRIMARY KEY ( user_id );

ALTER TABLE user_t ADD CONSTRAINT user_email_uk UNIQUE ( email );

user_host_t (User to host relationship or mapping):

CREATE TABLE user_host_t (
    host_id                   VARCHAR(24) NOT NULL,
    user_id                   VARCHAR(24) NOT NULL,
    -- other relationship-specific attributes (e.g., roles within the host)
    PRIMARY KEY (host_id, user_id),
    FOREIGN KEY (user_id) REFERENCES user_t (user_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id) REFERENCES host_t (host_id) ON DELETE CASCADE
);

employee_t (Employee Table): This table will store employee-specific attributes.

CREATE TABLE employee_t (
    host_id                   VARCHAR(22) NOT NULL,
    employee_id               VARCHAR(50) NOT NULL,  -- Employee ID or number or ACF2 ID. Unique within the host. 
    user_id                   VARCHAR(22) NOT NULL,
    title                     VARCHAR(255) NOT NULL,
    manager_id                VARCHAR(50), -- manager's employee_id if there is one.
    hire_date                 DATE,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, employee_id),
    FOREIGN KEY (host_id, user_id) REFERENCES user_host_t(host_id, user_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id, manager_id) REFERENCES employee_t(host_id, employee_id) ON DELETE CASCADE
);

customer_t (Customer Table): This table will store customer-specific attributes.

CREATE TABLE customer_t (
    host_id                   VARCHAR(24) NOT NULL,
    customer_id               VARCHAR(50) NOT NULL,
    user_id                   VARCHAR(24) NOT NULL,
    -- Other customer-specific attributes
    referral_id               VARCHAR(22), -- the customer_id who refers this customer. 
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, customer_id),
    FOREIGN KEY (host_id, user_id) REFERENCES user_host_t(host_id, user_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id, referral_id) REFERENCES customer_t(host_id, customer_id) ON DELETE CASCADE
);

position_t (Position Table): Defines different positions within the organization for employees.

CREATE TABLE position_t (
    host_id                   VARCHAR(22) NOT NULL,
    position_id               VARCHAR(22) NOT NULL,
    position_name             VARCHAR(255) UNIQUE NOT NULL,
    description               TEXT,
    inherit_to_ancestor       BOOLEAN DEFAULT FALSE,
    inherit_to_sibling        BOOLEAN DEFAULT FALSE,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, position_id)
);

employee_position_t (Employee Position Table): Links employees to their positions with effective dates.

CREATE TABLE employee_position_t (
    host_id                   VARCHAR(22) NOT NULL,
    employee_id               VARCHAR(50) NOT NULL,
    position_id               VARCHAR(22) NOT NULL,
    position_type             CHAR(1) NOT NULL, -- P position of own, D inherited from a descendant, S inherited from a sibling.
    start_date                DATE NOT NULL,
    end_date                  DATE,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, employee_id, position_id),
    FOREIGN KEY (host_id, position_id) REFERENCES position_t(host_id, position_id) ON DELETE CASCADE
);

Authorization Strategies

In order to link users to API endpoints for authorization, we will adopt the following approaches, using a rule engine to enforce the policies in the API's sidecar through the access-control middleware handler.

A. Role-Based Access Control (RBAC)

This is a common and relatively simple approach. You define roles (e.g., “admin,” “editor,” “viewer”) and assign permissions to those roles. Users are then assigned to one or more roles.

Role Table:

CREATE TABLE role_t (
    host_id                   VARCHAR(22) NOT NULL,
    role_id                   VARCHAR(22) NOT NULL,
    role_name                 VARCHAR(255) UNIQUE NOT NULL,
    description               TEXT,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, role_id)
);

Role-Endpoint Permission Table:

CREATE TABLE role_permission_t (
    host_id                   VARCHAR(32) NOT NULL,
    role_id                   VARCHAR(32) NOT NULL,
    endpoint_id               VARCHAR(64) NOT NULL,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, role_id, endpoint_id),
    FOREIGN KEY (host_id, role_id) REFERENCES role_t(host_id, role_id) ON DELETE CASCADE,
    FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE
);

Role-User Assignment Table:

CREATE TABLE role_user_t (
    host_id                   VARCHAR(22) NOT NULL,
    role_id                   VARCHAR(22) NOT NULL,
    user_id                   VARCHAR(22) NOT NULL,
    start_date DATE NOT NULL DEFAULT CURRENT_DATE,
    end_date DATE,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, role_id, user_id, start_date),
    FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id, role_id) REFERENCES role_t(host_id, role_id) ON DELETE CASCADE
);
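
For illustration, here is a minimal sketch of an RBAC lookup using the tables above: list the endpoints a user can reach on a given host through role assignments that are currently active.

SELECT DISTINCT rp.endpoint_id
FROM role_user_t ru
JOIN role_permission_t rp
  ON rp.host_id = ru.host_id
 AND rp.role_id = ru.role_id
WHERE ru.host_id = ?
  AND ru.user_id = ?
  AND ru.start_date <= CURRENT_DATE
  AND (ru.end_date IS NULL OR ru.end_date >= CURRENT_DATE);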

B. User-Based Access Control (UBAC)

This approach assigns permissions directly to users, allowing for very fine-grained control. It’s more flexible but can become complex to manage if you have a lot of users and endpoints. It should only be used for temporary access.

User-Endpoint Permissions Table:

CREATE TABLE user_permission_t (
    user_id                   VARCHAR(22) NOT NULL,
    host_id                   VARCHAR(22) NOT NULL,
    endpoint_id               VARCHAR(22) NOT NULL,
    start_date DATE NOT NULL DEFAULT CURRENT_DATE,
    end_date DATE,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (user_id, host_id, endpoint_id),
    FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
    FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE
);

C. Group-Based Access Control (GBAC)

You can group users into teams or departments and assign permissions to those groups. This is useful when you want to manage permissions for sets of users with similar access needs.

Groups Table:

CREATE TABLE group_t (
    host_id                   VARCHAR(32) NOT NULL,
    group_id                  VARCHAR(32) NOT NULL,
    group_name                VARCHAR(255) UNIQUE NOT NULL,
    description               TEXT,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, group_id)
);

Group-Endpoint Permission Table:

CREATE TABLE group_permission_t (
    host_id                   VARCHAR(32) NOT NULL,
    group_id                  VARCHAR(32) NOT NULL,
    endpoint_id               VARCHAR(32) NOT NULL,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, group_id, endpoint_id),
    FOREIGN KEY (host_id, group_id) REFERENCES group_t(host_id, group_id) ON DELETE CASCADE,
    FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE
);

Group-User Membership Table:

CREATE TABLE group_user_t (
    host_id                   VARCHAR(22) NOT NULL,
    group_id                  VARCHAR(22) NOT NULL,
    user_id                   VARCHAR(22) NOT NULL,
    start_date DATE NOT NULL DEFAULT CURRENT_DATE,
    end_date DATE,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, group_id, user_id, start_date),
    FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id, group_id) REFERENCES group_t(host_id, group_id) ON DELETE CASCADE
);

D. Attribute-Based Access Control (ABAC)

Attribute Table:

CREATE TABLE attribute_t (
    host_id                   VARCHAR(22) NOT NULL,
    attribute_id              VARCHAR(22) NOT NULL,
    attribute_name            VARCHAR(255) UNIQUE NOT NULL, -- The name of the attribute (e.g., "department," "job_title," "project," "clearance_level," "location").
    attribute_type            VARCHAR(50) CHECK (attribute_type IN ('string', 'integer', 'boolean', 'date', 'float', 'list')), -- Define allowed data types
    description               TEXT,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, attribute_id)
);

Attribute User Table:
CREATE TABLE attribute_user_t (
    host_id                   VARCHAR(22) NOT NULL,
    attribute_id              VARCHAR(22) NOT NULL,
    user_id                   VARCHAR(22) NOT NULL, -- References user_t
    attribute_value           TEXT, -- Store values as strings; you can cast later
    start_date                DATE NOT NULL DEFAULT CURRENT_DATE,
    end_date                  DATE,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, attribute_id, user_id, start_date),
    FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id, attribute_id) REFERENCES attribute_t(host_id, attribute_id) ON DELETE CASCADE
);


Attribute Permission Table:
CREATE TABLE attribute_permission_t (
    host_id                   VARCHAR(32) NOT NULL,
    attribute_id              VARCHAR(32) NOT NULL,
    endpoint_id               VARCHAR(32) NOT NULL, -- References api_endpoint_t
    attribute_value           TEXT,
    update_user               VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_timestamp          TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (host_id, attribute_id, endpoint_id),
    FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE,
    FOREIGN KEY (host_id, attribute_id) REFERENCES attribute_t(host_id, attribute_id) ON DELETE CASCADE
);

How it Works:

  1. Define Attributes: Define all relevant attributes in attribute_t. Think about all the properties of your users, resources, and environment that might be used in access control decisions.

  2. Assign Attributes to Users: Populate attribute_user_t to associate attribute values with users.

  3. Assign Attributes to Endpoints: Populate attribute_permission_t to associate attribute values with API endpoints.

  4. Write Policies: Create policy rules in rule engine. These rules should use the attribute names defined in attribute_t.

  5. Policy Evaluation (at runtime):

  • The policy engine receives the subject (user), resource (API endpoint), and action (HTTP method) of the request.

  • The engine retrieves the relevant attributes from the attribute_user_t and attribute_permission_t tables (see the sketch after this list).

  • The engine evaluates the policy rule from the relevant policies against the attributes.

  • Based on the policy evaluation result, access is either granted or denied.
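
The sketch below covers only the retrieval part of this flow; the comparison itself is left to the rule engine. It loads the user's active attribute values and the attribute values required by the requested endpoint.

-- Attributes currently assigned to the user
SELECT at.attribute_name, aut.attribute_value
FROM attribute_user_t aut
JOIN attribute_t at
  ON at.host_id = aut.host_id
 AND at.attribute_id = aut.attribute_id
WHERE aut.host_id = ?
  AND aut.user_id = ?
  AND aut.start_date <= CURRENT_DATE
  AND (aut.end_date IS NULL OR aut.end_date >= CURRENT_DATE);

-- Attribute values required by the requested endpoint
SELECT at.attribute_name, ap.attribute_value
FROM attribute_permission_t ap
JOIN attribute_t at
  ON at.host_id = ap.host_id
 AND at.attribute_id = ap.attribute_id
WHERE ap.host_id = ?
  AND ap.endpoint_id = ?;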

Key Advantages of ABAC:

  • Fine-Grained Control: Express very specific access rules.

  • Centralized Policy Management: Policies are stored centrally and can be easily updated.

  • Flexibility and Scalability: Adapts easily to changing requirements.

  • Auditing and Compliance: Easier to audit and demonstrate compliance.

Format of attributes in JWT token:

Unlike roles, groups, and positions, which can be concatenated into a string, an attribute is a key/value pair. We need to format multiple attributes into a single string and put it into the token.

Challenges

  • Spaces: The primary issue is that simple key-value pairs like key1:value1 key2:value2 will not work when values contain spaces.

  • Escaping: We need a way to escape characters that may confuse the parser, for example if the value also contains a :.

  • Readability: The format should be reasonably readable for debugging and human consumption.

  • Parsing: The format should be easy to parse on the application side.

Options

  1. Comma-Separated Key-Value Pairs with Escaping:
  • Format: key1=value1,key2=value2_with_spaces,key3=value3,with,commas

  • Escaping: Use a backslash \ to escape commas and backslashes within the values. Spaces can also be escaped to make the boundaries clearer.

  • Pros: Simple to implement, relatively easy to parse using splitting by comma and then by =.

  • Cons: Can become hard to read with complex values, requires proper escaping, and becomes unreadable when backslashes themselves need to be escaped.

  2. Custom Delimiter and Escaping:
  • Format: key1^=^value1~key2^=^value2 with spaces~key3^=^value3~

  • Delimiter: Use ^=^ as the delimiter between key and value, and ~ to separate attributes.

  • Pros: Avoids many escaping issues and keeps spaces intact; easier to read than comma-separated values.

  • Cons: The delimiters must be chosen carefully so they never appear in the values.

  3. URL-Encoded Key-Value Pairs:
  • Format: key1=value1&key2=value+with+spaces&key3=value3%2Cwith%2Ccommas

  • Pros: Well-established standard, handles spaces and special characters well.

  • Cons: Requires URL encoding and decoding, slightly more overhead, can be less readable.

Recommended Approach: Custom Delimiter with Simple Escaping

We recommend the Custom Delimiter with Simple Escaping approach for this use case. It strikes a good balance between simplicity, readability, and the ability to handle spaces within values. It avoids relying on URL encoding and also avoids the readability problems of comma-separated values with backslash escaping.
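
As a minimal sketch of how this format can be produced on the database side (the full login query later in this document uses the same pattern):

SELECT string_agg(at.attribute_name || '^=^' || aut.attribute_value, '~'
                  ORDER BY at.attribute_name) AS att
FROM attribute_user_t aut
JOIN attribute_t at
  ON at.host_id = aut.host_id
 AND at.attribute_id = aut.attribute_id
WHERE aut.user_id = ?;
-- e.g. department^=^payments~security_clearance_level^=^2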

JWT Security Claims

Using the tables defined above, follow these steps to create an authorization code token with user security claims:

  1. uid
    The entity_id (e.g., employee_id for employees and customer_id for customers) should be assigned to the uid claim in the JWT. This uid will be used by the response transformer to filter the response for the user and must represent a business identifier.

    Examples:

    • Employee: Use the ACF2 ID as the uid.
    • Customer: Use the CIF ID as the uid (e.g., in a banking context).
  2. role
    Include a list of roles associated with the user.

  3. grp
    Add a list of groups the user belongs to.

  4. att
    Include a list of key-value pairs representing user attributes.

  5. pos
    Include a list of positions for the user.

  6. host
    The host of the user.

Example Token

eyJraWQiOiJUal9sX3RJQlRnaW5PdFFiTDBQdjV3IiwiYWxnIjoiUlMyNTYifQ.eyJpc3MiOiJ1cm46Y29tOm5ldHdvcmtudDpvYXV0aDI6djEiLCJhdWQiOiJ1cm46Y29tLm5ldHdvcmtudCIsImV4cCI6MTczNDA2NDU5NSwianRpIjoicEs4WEtDZkU1aVFSdWdlQThJWXBwZyIsImlhdCI6MTczNDA2Mzk5NSwibmJmIjoxNzM0MDYzODc1LCJ2ZXIiOiIxLjAiLCJ1aWQiOiJzaDM1IiwidXR5IjoiRSIsImNpZCI6ImY3ZDQyMzQ4LWM2NDctNGVmYi1hNTJkLTRjNTc4NzQyMWU3MiIsImNzcmYiOiItTUN4OGhZRlF1bVZ3NFZkRDVHbEd3Iiwic2NwIjpbInBvcnRhbC5yIiwicG9ydGFsLnciLCJyZWYuciIsInJlZi53Il0sInJvbGUiOiJhZG1pbiB1c2VyIiwiYzEiOiIzNjEiLCJjMiI6IjY3IiwiZ3JwIjoiZGVsZXRlIGluc2VydCBzZWxlY3QgdXBkYXRlIiwiYXR0IjoiY291bnRyeV49XkNBTn5wZXJhbmVudCBlbXBsb3llZV49XnRydWV-c2VjdXJpdHlfY2xlYXJhbmNlX2xldmVsXj1eMiIsInBvcyI6IkFQSVBsYXRmb3JtRGVsaXZlcnkiLCJob3N0IjoiTjJDTXcwSEdRWGVMdkMxd0JmbG4yQSJ9.Gky_rR9hreP04GZm-0H_HBBAeDIPhQ9tsNuZclUzTdkMrYay40kcNk4jWkPdMcxfIfIbGj2eqSQgNhkBuym2yc6HsRF0nukZhYSGklVNXFe3R-0DdKwxxWyqvXyWDvrQtme0ttT2tYGTRRCZXnHDRMUFeDSz7kVjjIj3WymjFyxWBnWnBOjYqDL34652Fb8c7hWME0nSxbWO0ZvPRDhRM-l0nDGNm2ojq-3sjaU_pRywYahXP-wtnNSLwvctFgONPWSM9Ie6FqwRmYBFVo8OE0VdTRvUfnO4mL1O2UbTfxzbNJFv4HP1mSZG_SSB5j3t_RuZLfUMIajFi105ze2PUg

And the payload:

{
  "iss": "urn:com:networknt:oauth2:v1",
  "aud": "urn:com.networknt",
  "exp": 1734064595,
  "jti": "pK8XKCfE5iQRugeA8IYppg",
  "iat": 1734063995,
  "nbf": 1734063875,
  "ver": "1.0",
  "uid": "sh35",
  "uty": "E",
  "cid": "f7d42348-c647-4efb-a52d-4c5787421e72",
  "csrf": "-MCx8hYFQumVw4VdD5GlGw",
  "scp": [
    "portal.r",
    "portal.w",
    "ref.r",
    "ref.w"
  ],
  "role": "admin user",
  "c1": "361",
  "c2": "67",
  "grp": "delete insert select update",
  "att": "country^=^CAN~peranent employee^=^true~security_clearance_level^=^2",
  "pos": "APIPlatformDelivery",
  "host": "N2CMw0HGQXeLvC1wBfln2A"
}

Group and Position Management

You can create groups that align with teams, departments, or other organizational units. These groups are relatively static and reflect the overall organizational structure. Use a separate table, group_t, as described earlier, to store these groups. Groups can be applied to all users regardless of their user type.

Use the Employee Reporting Structure to Manage Positions

Positions are similar to groups in managing user permissions, but they leverage the organizational reporting structure to propagate permissions between team members and their direct manager.

  • Position Flags

    Each position in the position_t table has two flags:

  • inherit_to_ancestor: Determines whether the position is inherited upward by the employee's direct manager (ancestor).
  • inherit_to_sibling: Determines if the position is inherited by team members (siblings) under the same manager.
  • Responsibilities

    The application is responsible for propagating positions (a query sketch follows this list):

  • Between Siblings: Assigning inherited positions to team members under the same manager.
  • To the Manager: Assigning inherited positions to the direct manager.
  • User Interface for Position Management

    A user interface (UI) can be implemented to simplify position management:

  • Feature: List all potential inherited positions for selection when adding a new user or changing a manager.
  • Functionality: Allow administrators to choose specific positions to inherit for users and managers dynamically.
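
As a minimal sketch, for illustration only, of the data the application consults when propagating positions: given an employee, find the direct manager and the siblings reporting to the same manager.

-- Direct manager of the employee
SELECT manager_id
FROM employee_t
WHERE host_id = ? AND employee_id = ?;

-- Siblings: employees reporting to the same manager, excluding the employee themselves
SELECT e.employee_id
FROM employee_t e
WHERE e.host_id = ?
  AND e.manager_id = (SELECT manager_id FROM employee_t
                      WHERE host_id = ? AND employee_id = ?)
  AND e.employee_id <> ?;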

Use Both Groups and Positions

You can choose to use both groups and positions for your organization. However, you need to ensure that groups and positions categorize users across different dimensions. In general, groups should be used for customers, while positions should be used for employees.

User Login Query

Here is the query to run against the database tables upon a user login request:

SELECT
    u.user_id,
    u.user_type,
    CASE
        WHEN u.user_type = 'E' THEN e.employee_id
        WHEN u.user_type = 'C' THEN c.customer_id
        ELSE NULL
    END AS entity_id,
    CASE WHEN u.user_type = 'E' THEN string_agg(DISTINCT p.position_name, ' ' ORDER BY p.position_name) ELSE NULL END AS positions,
    string_agg(DISTINCT r.role_name, ' ' ORDER BY r.role_name) AS roles,
    string_agg(DISTINCT g.group_name, ' ' ORDER BY g.group_name) AS groups,
     CASE
        WHEN COUNT(DISTINCT at.attribute_name || '^=^' || aut.attribute_value) > 0 THEN string_agg(DISTINCT at.attribute_name || '^=^' || aut.attribute_value, '~' ORDER BY at.attribute_name || '^=^' || aut.attribute_value)
        ELSE NULL
    END AS attributes
FROM
    user_t AS u
LEFT JOIN
    user_host_t AS uh ON u.user_id = uh.user_id
LEFT JOIN
    role_user_t AS ru ON u.user_id = ru.user_id
LEFT JOIN
    role_t AS r ON ru.host_id = r.host_id AND ru.role_id = r.role_id
LEFT JOIN
    attribute_user_t AS aut ON u.user_id = aut.user_id
LEFT JOIN
    attribute_t AS at ON aut.host_id = at.host_id AND aut.attribute_id = at.attribute_id
LEFT JOIN
    group_user_t AS gu ON u.user_id = gu.user_id
LEFT JOIN
    group_t AS g ON gu.host_id = g.host_id AND gu.group_id = g.group_id
LEFT JOIN
    employee_t AS e ON uh.host_id = e.host_id AND u.user_id = e.user_id
LEFT JOIN
    customer_t AS c ON uh.host_id = c.host_id AND u.user_id = c.user_id
LEFT JOIN
    employee_position_t AS ep ON e.host_id = ep.host_id AND e.employee_id = ep.employee_id
LEFT JOIN
    position_t AS p ON ep.host_id = p.host_id AND ep.position_id = p.position_id
WHERE
    u.email = '[email protected]'
GROUP BY
    u.user_id, u.user_type, e.employee_id, c.customer_id;

And here is an example result from the test database:

utgdG50vRVOX3mL1Kf83aA  E   sh35    APIPlatformDelivery admin user  delete insert select update country^=^CAN~peranent employee^=^true~security_clearance_level^=^2

Parse Attribute String

The query above returns attributes in a customized format. These attributes can be parsed using the Util.parseAttributes method available in the light-4j utility module.

Portal View and Default Role

Given the flexibility of fine-grained authorization approaches, users can choose one or more methods to suit their business requirements. However, in scenarios where RBAC (Role-Based Access Control) is not utilized, the role claim may not exist in the custom claims of the JWT token.

Handling Missing role in JWT

For the portal-view application, at least one role is required to filter menu items. To address cases where no roles are present in the JWT:

  1. Default Role Assignment:
    If the role claim is absent in the JWT, the system will:

    • Assign a default role, "user", to ensure compatibility.
    • Include this role in a roles field in the browser cookie.
  2. Cookie Roles Field:

    • The roles field in the cookie will contain a single role: "user".
    • This ensures the portal-view can still function as expected by displaying the appropriate menu items for users.

Example Workflow

  1. A user authenticates, and their JWT is generated without a role claim.
  2. During authentication handling:
    • The StatelessAuthHandler checks for the presence of the role claim.
    • If no roles are found, the "user" role is added to the roles field in the cookie.
  3. The portal-view reads the roles field from the cookie to filter menu items appropriately.

This approach provides a seamless experience while maintaining compatibility with applications requiring roles for authorization or UI customization.

Config Server

Default Config Properties

For each config class in light-4j modules, we use annotations to generate schemas for the config files with default values, comments and validation rules.

As a one-time step, we also generate events to load all the properties into the light-portal. These events create a baseline of the config properties with default values. None of the events in this initial population carries a version.

For each version release, we will create and attach an event.json file with the changes to the properties. Most likely, we will add some properties with default values for each release. All events in this file have a version associated with them. Once the events are replayed on the portal, the updates for that version are applied.

On the portal UI, we load all properties and default values from the database as a union of the baseline properties and all versions up to and including the current version.
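
A minimal sketch of that lookup is shown below. The table and column names (config_property_t with a default_value column, and a per-release config_property_version_t) are assumptions made for illustration only; the actual schema may differ.

-- Hypothetical schema for illustration: pick the latest per-release override at or below the
-- current product version, falling back to the baseline default (assumes versions compare correctly).
SELECT DISTINCT ON (p.property_id)
       p.property_id,
       p.property_name,
       COALESCE(v.default_value, p.default_value) AS effective_default
FROM config_property_t p
LEFT JOIN config_property_version_t v
  ON v.property_id = p.property_id
 AND v.product_version <= ?            -- current product version
ORDER BY p.property_id, v.product_version DESC NULLS LAST;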

Instance Config Snapshot

Once a logical instance is created on the light-portal, we need to provide the product_version_id, which maps to a specific product version. We also need to provide the runtime configuration and deployment configuration for the instance to start the server and deploy it to a target environment. Updating the configuration can be a process of discovery and may take several revisits to complete. If users make a mistake, they might want to roll back their changes to a snapshot version and start over. During deployment, we also need to save and tag a snapshot version so that we can roll back to the previous deployment configuration snapshot in case of deployment failure.

The above requirements force us to create a table that records each commit of config updates at the instance level. Like a Git commit, it groups several updates together. The user explicitly clicks the commit button on the UI, and the server runs the query that populates the snapshot table and creates a new snapshot id.

During the deployment, the deployment service will invoke the config server to force a commit and link that commit to a deployment id, just like a tag in GitHub.

To meet the requirements above, we need to design tables that store immutable snapshots associated with a commitId/snapshotId to provide reliable rollback points.

Snapshot tables

CREATE TABLE config_snapshot_t (
    snapshot_id                 UUID NOT NULL, -- Primary Key, maybe UUIDv7 for time ordering
    snapshot_ts                 TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
    snapshot_type               VARCHAR(32) NOT NULL, -- e.g., 'DEPLOYMENT', 'USER_SAVE', 'SCHEDULED_BACKUP'
    description                 TEXT,                 -- User-provided description or system-generated info
    user_id                     UUID,                 -- User who triggered it (if applicable)
    deployment_id               UUID,                 -- FK to deployment_t if snapshot_type is 'DEPLOYMENT'
    -- Scope columns define WHAT this snapshot represents:
    scope_host_id               UUID NOT NULL,      -- Host context (always needed)
    scope_config_phase          CHAR(1) NOT NULL,   -- config phase context(required)
    scope_environment           VARCHAR(16),        -- Environment context (if snapshot is env-specific)
    scope_product_id            VARCHAR(8),         -- Product id context
    scope_product_version       VARCHAR(12),        -- Product version context
    scope_service_id            VARCHAR(512),       -- Service id context
    scope_api_id                VARCHAR(16),        -- Api id context
    scope_api_version           VARCHAR(16),        -- Api version context
    PRIMARY KEY(snapshot_id),
    FOREIGN KEY(deployment_id) REFERENCES deployment_t(deployment_id) ON DELETE SET NULL,
    FOREIGN KEY(user_id) REFERENCES user_t(user_id) ON DELETE SET NULL,
    FOREIGN KEY(scope_host_id) REFERENCES host_t(host_id) ON DELETE CASCADE
);

-- Index for finding snapshots by type or scope
CREATE INDEX idx_config_snapshot_scope ON config_snapshot_t (scope_host_id, scope_config_phase, scope_environment, 
    scope_product_id, scope_product_version, scope_service_id, scope_api_id, scope_api_version, snapshot_type, snapshot_ts);
CREATE INDEX idx_config_snapshot_deployment ON config_snapshot_t (deployment_id);


CREATE TABLE config_snapshot_property_t (
    snapshot_property_id        UUID NOT NULL,         -- Surrogate primary key for easier referencing/updates if needed
    snapshot_id                 UUID NOT NULL,         -- FK to config_snapshot_t
    config_id                   UUID NOT NULL,         -- The config id
    property_id                 UUID NOT NULL,         -- The final property id 
    property_name               VARCHAR(64) NOT NULL,  -- The final property name
    property_type               VARCHAR(32) NOT NULL,  -- The property type
    property_value              TEXT,                  -- The effective property value at snapshot time
    value_type                  VARCHAR(32),           -- Optional: Store the type (string, int, bool...) for easier parsing later
    source_level                VARCHAR(32),           -- e.g., 'instance', 'product_version', 'environment', 'default'
    PRIMARY KEY(snapshot_property_id),
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);

-- Unique constraint to ensure one value per key within a snapshot
ALTER TABLE config_snapshot_property_t
    ADD CONSTRAINT config_snapshot_property_uk UNIQUE (snapshot_id, config_id, property_id);

-- Index for quickly retrieving all properties for a snapshot
CREATE INDEX idx_config_snapshot_property_snapid ON config_snapshot_property_t (snapshot_id);


-- Snapshot of Instance API Overrides
CREATE TABLE snapshot_instance_api_property_t (
    snapshot_id         UUID NOT NULL,
    host_id             UUID NOT NULL,
    instance_api_id     UUID NOT NULL,
    property_id         UUID NOT NULL,
    property_value      TEXT,
    update_user         VARCHAR (255) NOT NULL,
    update_ts           TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY(snapshot_id, host_id, instance_api_id, property_id), -- Composite PK matches original structure + snapshot_id
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);
CREATE INDEX idx_snap_iapi_prop ON snapshot_instance_api_property_t (snapshot_id);


-- Snapshot of Instance App Overrides
CREATE TABLE snapshot_instance_app_property_t (
    snapshot_id         UUID NOT NULL,
    host_id             UUID NOT NULL,
    instance_app_id     UUID NOT NULL,
    property_id         UUID NOT NULL,
    property_value      TEXT,
    update_user         VARCHAR (255) NOT NULL,
    update_ts           TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY(snapshot_id, host_id, instance_app_id, property_id),
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);
CREATE INDEX idx_snap_iapp_prop ON snapshot_instance_app_property_t (snapshot_id);

-- Snapshot of Instance App API Overrides
CREATE TABLE snapshot_instance_app_api_property_t (
    snapshot_id         UUID NOT NULL,
    host_id             UUID NOT NULL,
    instance_app_id     UUID NOT NULL,
    instance_api_id     UUID NOT NULL,
    property_id         UUID NOT NULL,
    property_value      TEXT,
    update_user         VARCHAR (255) NOT NULL,
    update_ts           TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY(snapshot_id, host_id, instance_app_id, instance_api_id, property_id),
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);
CREATE INDEX idx_snap_iaappi_prop ON snapshot_instance_app_api_property_t (snapshot_id);


-- Snapshot of Instance Overrides
CREATE TABLE snapshot_instance_property_t (
    snapshot_id         UUID NOT NULL,
    host_id             UUID NOT NULL,
    instance_id         UUID NOT NULL,
    property_id         UUID NOT NULL,
    property_value      TEXT,
    update_user         VARCHAR (255) NOT NULL,
    update_ts           TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY(snapshot_id, host_id, instance_id, property_id),
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);
CREATE INDEX idx_snap_inst_prop ON snapshot_instance_property_t (snapshot_id);


-- Snapshot of Environment Overrides (If needed for rollback)
CREATE TABLE snapshot_environment_property_t (
    snapshot_id         UUID NOT NULL,
    host_id             UUID NOT NULL,
    environment         VARCHAR(16) NOT NULL,
    property_id         UUID NOT NULL,
    property_value      TEXT,
    update_user         VARCHAR (255) NOT NULL,
    update_ts           TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY(snapshot_id, host_id, environment, property_id),
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);
CREATE INDEX idx_snap_env_prop ON snapshot_environment_property_t (snapshot_id);

CREATE TABLE snapshot_product_property_t (
    snapshot_id         UUID NOT NULL,
    product_id          VARCHAR(8) NOT NULL,
    property_id         UUID NOT NULL,
    property_value      TEXT,
    update_user         VARCHAR (255) NOT NULL,
    update_ts           TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY(snapshot_id, product_id, property_id),
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);
CREATE INDEX idx_snap_prd_prop ON snapshot_product_property_t (snapshot_id);

CREATE TABLE snapshot_product_version_property_t (
    snapshot_id         UUID NOT NULL,
    host_id             UUID NOT NULL,
    product_version_id  UUID NOT NULL,
    property_id         UUID NOT NULL,
    property_value      TEXT,
    update_user         VARCHAR (255) NOT NULL,
    update_ts           TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY(snapshot_id, host_id, product_version_id, property_id),
    FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
);
CREATE INDEX idx_snap_pv_prop ON snapshot_product_version_property_t (snapshot_id);

How to generate rollback events

There are two options to generate rollback events or compensate events.

Option 1. With historical events.

  1. Identify Target State: You have a snapshot_id representing the desired historical state.
  2. Find Snapshot Timestamp: Get the snapshot_ts from config_snapshot_t for the target snapshot_id.
  3. Query Events: Find all configuration events in your event store that:
    • Occurred after the snapshot_ts.
    • Relate to the specific scope (host, instance, environment, etc.) being rolled back.
  4. Generate Compensating Events: For each event found in step 3, create its logical inverse (a “compensating event”). For example:
    • InstancePropertyUpdated { propertyId: X, newValue: B, oldValue: A } -> InstancePropertyUpdated { propertyId: X, newValue: A, oldValue: B } (Requires storing oldValue in the original event).
    • InstancePropertyCreated { propertyId: X, value: A } -> InstancePropertyDeleted { propertyId: X, value: A } (Requires storing the value in the delete event for potential future rollback).
    • InstancePropertyDeleted { propertyId: X, value: A } -> InstancePropertyCreated { propertyId: X, value: A } (Requires storing the value in the delete event).
  5. Order Compensating Events: Sort the generated compensating events in the reverse chronological order of the original events they are compensating for.
  6. Replay Compensating Events: Apply these ordered compensating events through your event handling system.

Conceptually, this is a valid approach often used in event sourcing patterns (related to compensating transactions). However, it comes with significant challenges and complexities:

Challenges & Considerations:

  1. Generating Perfect Inverse Events: This is the hardest part.
    • Requires Rich Events: Your original events must contain enough information to construct their inverse. For updates, you need the oldValue. For creations, the delete needs the key. For deletions, the create needs the deleted value. If your current events don’t store this, you cannot reliably generate compensating events this way.
    • Complexity: For multi-step or complex operations, determining the exact inverse sequence can be non-trivial.
  2. Order of Operations: Compensating events MUST be applied in strict reverse order. Getting this wrong can lead to incorrect states.
  3. State Dependencies: Event handlers sometimes make assumptions about the state before the event is applied. Replaying compensating events might encounter unexpected states if other unrelated changes have occurred or if the reverse logic isn’t perfect, potentially causing handler errors.
  4. Performance: Querying potentially thousands of events, generating inverses, and replaying them might be slow, especially if the time gap between the snapshot and the present is large.
  5. Snapshot Data Not Used: This approach doesn’t directly leverage the known good state stored in config_snapshot_property_t. It relies solely on the ability to perfectly reverse subsequent events.
  6. Idempotency: Compensating event handlers should ideally be idempotent (applying them multiple times has the same effect as applying them once), although this is hard to guarantee for inverse operations.

Option 2: Diff-based event generation.

  1. Get Target State: Fetch key-values from config_snapshot_property_t for snapshot_id. (TargetState)
  2. Get Current State: Run aggregation query for the current configuration. (CurrentState)
  3. Calculate Diff: Find differences between TargetState and CurrentState (a SQL sketch follows this list).
  4. Generate Corrective Events: Create events to transform CurrentState into TargetState.
    • If key is in TargetState but different/missing in CurrentState -> Generate Upsert[Level]Property event with the value from TargetState (applied at the highest relevant override level for the scope).
    • If key is in CurrentState but missing in TargetState -> Generate Delete[Level]Property event for the override that’s currently providing the value (likely the highest relevant override level).
  5. Apply Events: Apply these corrective events.
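
Here is a minimal sketch of the diff in step 3, assuming the CurrentState from step 2 has been materialized into a hypothetical current_state_t(config_id, property_id, property_value) table:

WITH target_state AS (
    SELECT config_id, property_id, property_value
    FROM config_snapshot_property_t
    WHERE snapshot_id = ?                      -- the snapshot to roll back to
)
SELECT
    COALESCE(t.config_id, c.config_id)      AS config_id,
    COALESCE(t.property_id, c.property_id)  AS property_id,
    t.property_value                         AS target_value,
    c.property_value                         AS current_value,
    CASE
        WHEN c.property_id IS NULL THEN 'UPSERT'   -- only in the snapshot: re-create the override
        WHEN t.property_id IS NULL THEN 'DELETE'   -- only in the current state: remove the override
        WHEN t.property_value IS DISTINCT FROM c.property_value THEN 'UPSERT'
        ELSE 'NOOP'
    END AS corrective_action
FROM target_state t
FULL OUTER JOIN current_state_t c
  ON c.config_id = t.config_id
 AND c.property_id = t.property_id;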

Why the Diff-Based Approach is Often Preferred for Snapshot Rollback:

  • Uses Known Good State: It directly uses the guaranteed state from the snapshot table.
  • Less Reliant on Event Reversibility: It doesn’t matter if the original events are perfectly reversible or store old values. It focuses on achieving the target state from the current state.
  • Potentially Fewer Events: Might generate fewer events than reversing a long history, focusing only on the net changes needed.
  • More Direct: The generated events directly aim to establish the target state, which can feel less fragile than relying on reversing history.

Conclusion:

While the method of reversing events since the snapshot is a recognized event sourcing pattern, it’s often more complex and potentially fragile for the specific task of rolling back to a known snapshot state compared to the diff-based corrective event generation method.

The diff-based method leverages the snapshot data directly and focuses on achieving the target state, making it generally more robust and often easier to implement correctly, as it doesn’t require perfectly reversible events.

How to create the snapshot

Let’s clarify how the scope_* columns in config_snapshot_t relate to the query that generates the snapshot and the override tables (*_property_t).

The Purpose of scope_* Columns:

The scope_* columns in config_snapshot_t serve one primary purpose: To record the specific context for which the snapshot was generated. They define what set of effective configuration values are stored in the associated config_snapshot_property_t rows.

Think of them as the input parameters that were used to run the aggregation query when the snapshot was created.

How They Are Used in the Snapshot Generation Query:

You do not need one scope_* column for every *_property_t table. Instead, the values you store in the scope_* columns are the parameters you pass into your aggregation query’s WHERE clauses to filter the rows from the relevant override tables according to the desired context.

Let’s refine the query strategy using the scope_* concept and aim for a more efficient query than repeated NOT EXISTS clauses (using ROW_NUMBER() or DISTINCT ON).

Example Scenario: Snapshotting for a specific Instance

Let’s say you want to create a snapshot for a specific instance_id on a specific host_id.

  1. Input Parameters:

    • p_host_id (UUID)
    • p_instance_id (UUID)
  2. Derive Related IDs (Inside your snapshot creation logic/service):

    • You’ll need to query instance_t to get the associated product_version_id, environment, etc., for this instance.
    • Query product_version_t to get product_id.
    • Let’s call these derived values v_product_version_id, v_environment, v_product_id.
  3. config_snapshot_t Record:

    • Generate a snapshot_id (e.g., UUIDv7).
    • snapshot_ts: CURRENT_TIMESTAMP
    • snapshot_type: e.g., ‘DEPLOYMENT’
    • scope_host_id: p_host_id
    • scope_instance_id: p_instance_id
    • scope_environment: v_environment (Store the derived environment for clarity, even though it came from the instance)
    • scope_product_version_id: v_product_version_id (Store for clarity)
    • scope_product_id: v_product_id (Store for clarity)
    • (Other scope_* columns like scope_instance_api_id would be NULL for this instance-level snapshot)
  4. Aggregation Query (Using ROW_NUMBER()): This query uses the input parameters (p_host_id, p_instance_id) and the derived values (v_product_version_id, v_environment, v_product_id) to find the highest priority value for each property_id.

WITH
-- Parameters derived before running this query:
--   p_host_id UUID
--   p_instance_id UUID
--   v_product_version_id UUID (derived from p_instance_id)
--   v_environment VARCHAR(16) (derived from p_instance_id)
--   v_product_id VARCHAR(8) (derived from v_product_version_id)

-- Find relevant instance_api_ids and instance_app_ids for the target instance
RelevantInstanceApis AS (
    SELECT instance_api_id
    FROM instance_api_t
    WHERE host_id = ?      -- p_host_id
      AND instance_id = ?  -- p_instance_id
),
RelevantInstanceApps AS (
    SELECT instance_app_id
    FROM instance_app_t
    WHERE host_id = ?      -- p_host_id
      AND instance_id = ?  -- p_instance_id
),

-- Pre-process Instance App API properties with merging logic
Merged_Instance_App_Api_Properties AS (
    SELECT
        iaap.property_id,
        CASE cp.value_type
            WHEN 'map' THEN
                COALESCE(jsonb_merge_agg(iaap.property_value::jsonb), '{}'::jsonb)::text
            WHEN 'list' THEN
                -- Order elements based on when they were added via the link table? Or property update_ts?
                -- Assuming property update_ts. Check the data model if linking time matters more.
                -- Requires a subquery if ordering elements.
                COALESCE((SELECT jsonb_agg(elem ORDER BY iaa.update_ts)
                          FROM jsonb_array_elements(sub.property_value::jsonb) elem
                          WHERE jsonb_typeof(sub.property_value::jsonb) = 'array'
                         ), '[]'::jsonb)::text
                /* Subquery approach for ordering list elements by property timestamp:
                COALESCE(
                    (SELECT jsonb_agg(elem ORDER BY prop.update_ts)
                     FROM instance_app_api_property_t prop,
                          jsonb_array_elements(prop.property_value::jsonb) elem
                     WHERE prop.host_id = iaap.host_id
                       AND prop.instance_app_id = iaap.instance_app_id
                       AND prop.instance_api_id = iaap.instance_api_id
                       AND prop.property_id = iaap.property_id
                       AND jsonb_typeof(prop.property_value::jsonb) = 'array'
                    ), '[]'::jsonb
                )::text */
            ELSE
                MAX(iaap.property_value)
                -- For simple types, MAX can work if only one entry is expected; otherwise timestamp logic is needed.
                /* More robust for simple types: pick the latest based on timestamp
                (SELECT property_value
                 FROM instance_app_api_property_t latest
                 WHERE latest.host_id = iaap.host_id
                   AND latest.instance_app_id = iaap.instance_app_id
                   AND latest.instance_api_id = iaap.instance_api_id
                   AND latest.property_id = iaap.property_id
                 ORDER BY latest.update_ts DESC
                 LIMIT 1) */
        END AS effective_value
    FROM instance_app_api_property_t iaap
    JOIN config_property_t cp ON iaap.property_id = cp.property_id
    JOIN instance_app_api_t iaa
        ON iaa.host_id = iaap.host_id
       AND iaa.instance_app_id = iaap.instance_app_id
       AND iaa.instance_api_id = iaap.instance_api_id -- Join to potentially use its timestamp for ordering lists
    WHERE iaap.host_id = ? -- p_host_id
      AND iaap.instance_app_id IN (SELECT instance_app_id FROM RelevantInstanceApps)
      AND iaap.instance_api_id IN (SELECT instance_api_id FROM RelevantInstanceApis)
    GROUP BY iaap.host_id, iaap.instance_app_id, iaap.instance_api_id, iaap.property_id, cp.value_type -- Group to aggregate/merge
),

-- Pre-process Instance API properties
Merged_Instance_Api_Properties AS (
    SELECT
        iap.property_id,
        CASE cp.value_type
            WHEN 'map' THEN
                COALESCE(jsonb_merge_agg(iap.property_value::jsonb), '{}'::jsonb)::text
            WHEN 'list' THEN
                COALESCE((SELECT jsonb_agg(elem ORDER BY prop.update_ts)
                          FROM instance_api_property_t prop,
                               jsonb_array_elements(prop.property_value::jsonb) elem
                          WHERE prop.host_id = iap.host_id
                            AND prop.instance_api_id = iap.instance_api_id
                            AND prop.property_id = iap.property_id
                            AND jsonb_typeof(prop.property_value::jsonb) = 'array'
                         ), '[]'::jsonb)::text
            ELSE
                (SELECT property_value
                 FROM instance_api_property_t latest
                 WHERE latest.host_id = iap.host_id
                   AND latest.instance_api_id = iap.instance_api_id
                   AND latest.property_id = iap.property_id
                 ORDER BY latest.update_ts DESC
                 LIMIT 1)
        END AS effective_value
    FROM instance_api_property_t iap
    JOIN config_property_t cp ON iap.property_id = cp.property_id
    WHERE iap.host_id = ? -- p_host_id
      AND iap.instance_api_id IN (SELECT instance_api_id FROM RelevantInstanceApis)
    GROUP BY iap.host_id, iap.instance_api_id, iap.property_id, cp.value_type
),

-- Pre-process Instance App properties
Merged_Instance_App_Properties AS (
    SELECT
        iapp.property_id,
        CASE cp.value_type
            WHEN 'map' THEN
                COALESCE(jsonb_merge_agg(iapp.property_value::jsonb), '{}'::jsonb)::text
            WHEN 'list' THEN
                COALESCE((SELECT jsonb_agg(elem ORDER BY prop.update_ts)
                          FROM instance_app_property_t prop,
                               jsonb_array_elements(prop.property_value::jsonb) elem
                          WHERE prop.host_id = iapp.host_id
                            AND prop.instance_app_id = iapp.instance_app_id
                            AND prop.property_id = iapp.property_id
                            AND jsonb_typeof(prop.property_value::jsonb) = 'array'
                         ), '[]'::jsonb)::text
            ELSE
                (SELECT property_value
                 FROM instance_app_property_t latest
                 WHERE latest.host_id = iapp.host_id
                   AND latest.instance_app_id = iapp.instance_app_id
                   AND latest.property_id = iapp.property_id
                 ORDER BY latest.update_ts DESC
                 LIMIT 1)
        END AS effective_value
    FROM instance_app_property_t iapp
    JOIN config_property_t cp ON iapp.property_id = cp.property_id
    WHERE iapp.host_id = ? -- p_host_id
      AND iapp.instance_app_id IN (SELECT instance_app_id FROM RelevantInstanceApps)
    GROUP BY iapp.host_id, iapp.instance_app_id, iapp.property_id, cp.value_type
),

-- Combine all levels with priority
AllOverrides AS (
    -- Priority 10: Instance App API (highest) - requires aggregating the merged results if
    -- multiple app/api combos apply to the instance
    SELECT
        m_iaap.property_id,
        -- Need final merge/latest logic here if multiple app/api combos apply to the SAME instance_id
        -- and define the SAME property_id. Assuming for now we take the first one found, or more
        -- complex logic is needed if a merge is required again at this stage. For simplicity, just
        -- take the MAX effective value if multiple rows exist per property_id for the instance.
        MAX(m_iaap.effective_value) AS property_value, -- This MAX might not be right for JSON; specific logic is needed if merging across app/api combos is required here
        10 AS priority_level
    FROM Merged_Instance_App_Api_Properties m_iaap
    -- No additional instance filter needed if the CTEs were already filtered by
    -- RelevantInstanceApps/RelevantInstanceApis linked to p_instance_id
    GROUP BY m_iaap.property_id -- Group to handle multiple app/api links potentially setting the same property for the instance

UNION ALL

-- Priority 20: Instance API
SELECT
    m_iap.property_id,
    MAX(m_iap.effective_value) as property_value, -- Similar merge concern as above
    20 AS priority_level
FROM Merged_Instance_Api_Properties m_iap
GROUP BY m_iap.property_id

UNION ALL

-- Priority 30: Instance App
SELECT
    m_iapp.property_id,
    MAX(m_iapp.effective_value) as property_value, -- Similar merge concern
    30 AS priority_level
FROM Merged_Instance_App_Properties m_iapp
GROUP BY m_iapp.property_id

UNION ALL

-- Priority 40: Instance
SELECT
    ip.property_id,
    ip.property_value,
    40 AS priority_level
FROM instance_property_t ip
WHERE ip.host_id = ? -- p_host_id
  AND ip.instance_id = ? -- p_instance_id

UNION ALL

-- Priority 50: Product Version
SELECT
    pvp.property_id,
    pvp.property_value,
    50 AS priority_level
FROM product_version_property_t pvp
WHERE pvp.host_id = ? -- p_host_id
  AND pvp.product_version_id = ? -- v_product_version_id

UNION ALL

-- Priority 60: Environment
SELECT
    ep.property_id,
    ep.property_value,
    60 AS priority_level
FROM environment_property_t ep
WHERE ep.host_id = ? -- p_host_id
  AND ep.environment = ? -- v_environment

UNION ALL

-- Priority 70: Product (Host independent)
SELECT
    pp.property_id,
    pp.property_value,
    70 AS priority_level
FROM product_property_t pp
WHERE pp.product_id = ? -- v_product_id

UNION ALL

-- Priority 100: Default values
SELECT
    cp.property_id,
    cp.property_value, -- Default value
    100 AS priority_level
FROM config_property_t cp
-- Optimization: Filter defaults to only those applicable to the product version?
-- JOIN product_version_config_property_t pvcp ON cp.property_id = pvcp.property_id
-- WHERE pvcp.host_id = ? AND pvcp.product_version_id = ?

),
RankedOverrides AS (
    SELECT
        ao.property_id,
        ao.property_value,
        ao.priority_level,
        ROW_NUMBER() OVER (PARTITION BY ao.property_id ORDER BY ao.priority_level ASC) as rn
    FROM AllOverrides ao
    WHERE ao.property_value IS NOT NULL -- Exclude levels where the value was NULL (unless NULL is a valid override)
)
-- Final Selection for Snapshot Table
SELECT
    -- snapshot_id needs to be added here or during INSERT
    cfg.config_name || '.' || cp.property_name AS property_key,
    ro.property_value,
    cp.property_type,
    cp.value_type
    -- Include ro.priority_level AS source_priority if storing provenance
FROM RankedOverrides ro
JOIN config_property_t cp ON ro.property_id = cp.property_id
JOIN config_t cfg ON cp.config_id = cfg.config_id
WHERE ro.rn = 1;
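
Note that `jsonb_merge_agg`, used by the map-merging branches in the CTEs above, is not a built-in PostgreSQL aggregate. If the database does not already provide one, a minimal sketch of such an aggregate (an assumption, built on the standard `jsonb_concat` function that backs the `||` operator) could look like this:

```sql
-- Hypothetical custom aggregate: merges JSONB objects left to right,
-- so keys from later rows overwrite keys from earlier rows.
CREATE AGGREGATE jsonb_merge_agg(jsonb) (
    SFUNC    = jsonb_concat, -- built-in function behind the || operator
    STYPE    = jsonb,
    INITCOND = '{}'
);
```

Keep in mind that `||` on JSONB performs a shallow merge; if nested maps must be merged key by key, a recursive merge function would be needed instead.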


5.  **Populate `config_snapshot_property_t`:** Insert the results of this query into `config_snapshot_property_t`, using the `snapshot_id` generated in step 3.

**Key Takeaways:**

*   The `scope_*` columns define the *context* of the snapshot.
*   The values for these `scope_*` columns are used as *parameters* within the `WHERE` clauses of the aggregation query that *generates* the snapshot data.
*   You don't need a `scope_*` column per override table. You need columns representing the different *dimensions* or *levels* by which you might want to define a snapshot's context (host, instance, environment, product version, etc.).
*   The aggregation query uses these parameters to filter the relevant rows from each override table and then determines the highest-priority value using `UNION ALL` and a ranking mechanism (`ROW_NUMBER()` or `DISTINCT ON`; a `DISTINCT ON` sketch is shown below).

This approach keeps the `config_snapshot_t` table focused on metadata and context, while the query handles the complex logic of applying that context to the various override tables to produce the effective configuration for `config_snapshot_property_t`.
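
For the ranking step mentioned above, here is a minimal `DISTINCT ON` variant, assuming the same `AllOverrides` shape (`property_id`, `property_value`, `priority_level`) used in the query:

```sql
-- Sketch: DISTINCT ON keeps only the first row per property_id when ordered
-- by priority_level, i.e. the highest-priority (lowest number) override.
SELECT DISTINCT ON (ao.property_id)
       ao.property_id,
       ao.property_value,
       ao.priority_level
FROM AllOverrides ao
WHERE ao.property_value IS NOT NULL
ORDER BY ao.property_id, ao.priority_level ASC;
```

`DISTINCT ON` is PostgreSQL-specific but avoids the extra window-function pass; `ROW_NUMBER()` is the portable alternative.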

### Config Phase

In the `config_t` table, a `config_phase` column separates the different stages of the API/app life cycle: for example, configuration for code generation, for runtime, and for deployment.

Given your two main use cases:

1.  **Service Startup:** Needs the *runtime* (`'R'`) configuration.
2.  **Deployment Rollback:** Needs to potentially restore the state required for *deployment* (`'D'`) and the resulting *runtime* (`'R'`) configuration from that point in time. (Generator `'G'` configs are usually less relevant for deployment/runtime rollbacks).

Here are the options and the recommended approach:

**Option 1: Phase-Specific Snapshots (Separate Records)**

*   **How:** Add `scope_config_phase CHAR(1)` to `config_snapshot_t`.
*   **Snapshot Creation:** When a snapshot event occurs (e.g., pre-deployment):
    *   Generate a `snapshot_id_D` (e.g., using UUIDv7).
    *   Run the aggregation query with `config_phase = 'D'`.
    *   Store results in `config_snapshot_property_t` linked to `snapshot_id_D`.
    *   Create metadata in `config_snapshot_t` for `snapshot_id_D` with `scope_config_phase = 'D'`.
    *   Generate *another* `snapshot_id_R`.
    *   Run the aggregation query with `config_phase = 'R'`.
    *   Store results in `config_snapshot_property_t` linked to `snapshot_id_R`.
    *   Create metadata in `config_snapshot_t` for `snapshot_id_R` with `scope_config_phase = 'R'`.
    *   You'd need a way to link `snapshot_id_D` and `snapshot_id_R` to the same logical event (e.g., same `related_deployment_id`).
*   **Pros:** Very explicit separation. Querying for a specific phase's snapshot is straightforward.
*   **Cons:** Requires multiple runs of the aggregation query. Doubles the metadata rows in `config_snapshot_t`. Complicates linking phases related to the same event. Less efficient.

**Option 2: Single Snapshot, Phase Included in Properties (Recommended)**

*   **How:** Do **not** add `scope_config_phase` to `config_snapshot_t`. Instead, add `config_phase CHAR(1)` to `config_snapshot_property_t`.
*   **Snapshot Creation:**
    *   Generate a single `snapshot_id`.
    *   Create one metadata row in `config_snapshot_t` representing the overall scope and time (without phase).
    *   **Modify the Aggregation Query:**
        *   **Remove** the `WHERE c.config_phase = ?` filter entirely.
        *   **SELECT** the `c.config_phase` value in the final `SELECT` statement.
    *   Run this modified query *once*. It will calculate the effective properties across *all* phases applicable to the scope.
    *   Store the results in `config_snapshot_property_t`, populating the new `config_phase` column for each property based on the phase of the `config_t` record from which it originated.
*   **`config_snapshot_property_t` Structure:**
    ```sql
    CREATE TABLE config_snapshot_property_t (
        -- ... other columns ...
        config_phase        CHAR(1) NOT NULL, -- Phase this property belongs to
        property_key        TEXT NOT NULL,
        property_value      TEXT,
        property_type       VARCHAR(32),
        value_type          VARCHAR(32),
        -- ...
        PRIMARY KEY(snapshot_property_id), -- Or PK(snapshot_id, config_phase, property_key)? Needs thought.
        FOREIGN KEY(snapshot_id) REFERENCES config_snapshot_t(snapshot_id) ON DELETE CASCADE
    );
    -- Ensure uniqueness within a snapshot for a given key *and phase*
    ALTER TABLE config_snapshot_property_t
        ADD CONSTRAINT config_snapshot_property_uk UNIQUE (snapshot_id, config_phase, property_key);
    -- Index for lookup by snapshot and phase
    CREATE INDEX idx_config_snapshot_property_snap_phase ON config_snapshot_property_t (snapshot_id, config_phase);
    ```
*   **Pros:**
    *   Only one aggregation query run per snapshot event.
    *   Captures the complete multi-phase state at that point under one `snapshot_id`.
    *   Simpler `config_snapshot_t` metadata.
    *   Facilitates rollbacks that might need to restore both `'D'` and `'R'` phase configurations.
*   **Cons:**
    *   Requires modifying the aggregation query.
    *   Consuming services (like startup) need to add `AND config_phase = 'R'` when querying `config_snapshot_property_t`.

Conclusion:

Option 2 is the better approach. Create a single snapshot representing the point-in-time state for the defined scope, but modify your aggregation query to:

  1. Remove the config_phase = ? filter.
  2. Select c.config_phase in the final output.
  3. Store this config_phase along with the effective property_key and property_value in the config_snapshot_property_t table.

This is more efficient for snapshot creation and provides a complete view for rollbacks. Consumers simply need to add a phase filter when reading the specific configuration they need (e.g., 'R' for runtime).
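
As a concrete illustration of that consumer-side filter, a service reading its runtime configuration from a snapshot could run something like the following sketch (assuming the `config_snapshot_property_t` layout shown earlier):

```sql
-- Sketch: fetch only the runtime ('R') phase properties of a given snapshot.
SELECT property_key, property_value, property_type, value_type
FROM config_snapshot_property_t
WHERE snapshot_id = ?      -- the snapshot being applied
  AND config_phase = 'R';  -- runtime phase only
```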

Modified Aggregation Query Snippet (Final SELECT):

-- (Inside the final SELECT statement of the modified aggregation query)
SELECT
    -- snapshot_id needs to be added here or during INSERT
    cfg.config_phase, -- <-- Select the phase
    cfg.config_name || '.' || cp.property_name AS property_key,
    ro.property_value,
    cp.property_type,
    cp.value_type
    -- Include ro.priority_level AS source_priority if storing provenance
FROM RankedOverrides ro
JOIN config_property_t cp ON ro.property_id = cp.property_id
JOIN config_t cfg ON cp.config_id = cfg.config_id
-- The join to config_t (aliased as cfg) already provides the phase;
-- it is needed anyway to build property_key from config_name.
WHERE ro.rn = 1;

### commitConfigInstance

Let's outline the structure of the commitConfigInstance service method and the necessary SQL INSERT statements using JDBC.

This involves several steps within a single database transaction:

  1. Generate Snapshot ID: Create a new UUID for the snapshot.
  2. Derive Scope IDs: Query live tables (instance_t, product_version_t, etc.) based on the input hostId and instanceId to get other relevant scope identifiers (environment, productId, productVersionId, serviceId, etc.).
  3. Insert Metadata: Insert a record into config_snapshot_t.
  4. Aggregate Effective Config: Run the complex aggregation query (using ROW_NUMBER() or similar) to get the final effective properties.
  5. Insert Effective Config: Insert the results from step 4 into config_snapshot_property_t.
  6. Snapshot Override Tables: For each relevant live override table (instance_property_t, instance_api_property_t, etc.), select its current state (filtered by scope) and insert it into the corresponding snapshot_*_property_t table.
  7. Commit/Rollback: Commit the transaction if all steps succeed, otherwise roll back.

Java Service Method Structure (Conceptual)

import com.github.f4b6a3.uuid.UuidCreator; // For UUIDv7 generation
import javax.sql.DataSource; // Assuming you have a DataSource injected
import java.sql.*;
import java.time.OffsetDateTime;
import java.util.*;

public class ConfigSnapshotService {

    private final DataSource ds;
    // Inject DataSource via constructor

    // Pre-compile your complex aggregation query (modify based on previous examples)
    private static final String AGGREGATE_EFFECTIVE_CONFIG_SQL = """
        WITH AllOverrides AS (
            -- Priority 10: Instance App API (merged) ...
            -- Priority 20: Instance API (merged) ...
            -- Priority 30: Instance App (merged) ...
            -- Priority 40: Instance ...
            -- Priority 50: Product Version ...
            -- Priority 60: Environment ...
            -- Priority 70: Product ...
            -- Priority 100: Default ...
        ),
        RankedOverrides AS (
           SELECT ..., ROW_NUMBER() OVER (PARTITION BY ao.property_id ORDER BY ao.priority_level ASC) as rn
           FROM AllOverrides ao WHERE ao.property_value IS NOT NULL
        )
        SELECT
            cfg.config_phase, -- Phase from config_t
            cfg.config_id,    -- Added config_id
            cp.property_id,   -- Added property_id
            cp.property_name, -- Added property_name
            cp.property_type,
            cp.value_type,
            cfg.config_name || '.' || cp.property_name AS property_key, -- Keep for logging/debug? Not needed in snapshot table itself
            ro.property_value,
            ro.priority_level -- To determine source_level
        FROM RankedOverrides ro
        JOIN config_property_t cp ON ro.property_id = cp.property_id
        JOIN config_t cfg ON cp.config_id = cfg.config_id
        WHERE ro.rn = 1;
    """; // NOTE: Add parameters (?) for host_id, instance_id, derived IDs etc.

    public Result<String> commitConfigInstance(Map<String, Object> event) {
        // 1. Extract Input Parameters
        UUID hostId = (UUID) event.get("hostId");
        UUID instanceId = (UUID) event.get("instanceId");
        String snapshotType = (String) event.getOrDefault("snapshotType", "USER_SAVE"); // Default type
        String description = (String) event.get("description");
        UUID userId = (UUID) event.get("userId"); // May be null
        UUID deploymentId = (UUID) event.get("deploymentId"); // May be null

        if (hostId == null || instanceId == null) {
            return Failure.of(new Status(INVALID_PARAMETER, "hostId and instanceId are required."));
        }

        UUID snapshotId = UuidCreator.getTimeOrderedEpoch(); // Generate Snapshot ID (e.g., V7)

        Connection connection = null;
        try {
            connection = ds.getConnection();
            connection.setAutoCommit(false); // Start Transaction

            // 2. Derive Scope IDs
            // Query instance_t and potentially product_version_t based on hostId, instanceId
            DerivedScope scope = deriveScopeInfo(connection, hostId, instanceId);
            if (scope == null) {
                connection.rollback(); // Rollback if instance not found
                return Failure.of(new Status(OBJECT_NOT_FOUND, "Instance not found for hostId/instanceId."));
            }

            // 3. Insert Snapshot Metadata
            insertSnapshotMetadata(connection, snapshotId, snapshotType, description, userId, deploymentId, hostId, scope);

            // 4 & 5. Aggregate and Insert Effective Config
            insertEffectiveConfigSnapshot(connection, snapshotId, hostId, instanceId, scope);

            // 6. Snapshot Individual Override Tables
            // Use INSERT ... SELECT ... for efficiency
            snapshotInstanceProperties(connection, snapshotId, hostId, instanceId);
            snapshotInstanceApiProperties(connection, snapshotId, hostId, instanceId);
            snapshotInstanceAppProperties(connection, snapshotId, hostId, instanceId);
            snapshotInstanceAppApiProperties(connection, snapshotId, hostId, instanceId); // Requires finding relevant App/API IDs first
            snapshotEnvironmentProperties(connection, snapshotId, hostId, scope.environment());
            snapshotProductVersionProperties(connection, snapshotId, hostId, scope.productVersionId());
            snapshotProductProperties(connection, snapshotId, scope.productId());
            // Add others as needed

            // 7. Commit Transaction
            connection.commit();
            logger.info("Successfully created config snapshot: {}", snapshotId);
            return Success.of(snapshotId.toString());

        } catch (SQLException e) {
            logger.error("SQLException during snapshot creation for instance {}: {}", instanceId, e.getMessage(), e);
            if (connection != null) {
                try {
                    connection.rollback();
                } catch (SQLException ex) {
                    logger.error("Error rolling back transaction:", ex);
                }
            }
            return Failure.of(new Status(SQL_EXCEPTION, "Database error during snapshot creation."));
        } catch (Exception e) { // Catch other potential errors (e.g., during scope derivation)
             logger.error("Exception during snapshot creation for instance {}: {}", instanceId, e.getMessage(), e);
             if (connection != null) {
                 try { connection.rollback(); } catch (SQLException ex) { logger.error("Error rolling back transaction:", ex); }
             }
            return Failure.of(new Status(GENERIC_EXCEPTION, "Unexpected error during snapshot creation."));
        } finally {
            if (connection != null) {
                try {
                    connection.setAutoCommit(true); // Restore default behavior
                    connection.close();
                } catch (SQLException e) {
                    logger.error("Error closing connection:", e);
                }
            }
        }
    }

    // --- Helper Methods ---

    // Placeholder for derived scope data structure
    private record DerivedScope(String environment, String productId, String productVersion, UUID productVersionId, String serviceId /*, add API details if needed */) {}

    private DerivedScope deriveScopeInfo(Connection conn, UUID hostId, UUID instanceId) throws SQLException {
        // Query instance_t LEFT JOIN product_version_t ... WHERE i.host_id = ? AND i.instance_id = ?
        // Extract environment, service_id from instance_t
        // Extract product_id, product_version from product_version_t (via product_version_id in instance_t)
        // Return new DerivedScope(...) or null if not found
        String sql = """
            SELECT i.environment, i.service_id, pv.product_id, pv.product_version, i.product_version_id
            FROM instance_t i
            LEFT JOIN product_version_t pv ON i.host_id = pv.host_id AND i.product_version_id = pv.product_version_id
            WHERE i.host_id = ? AND i.instance_id = ?
        """;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, hostId);
            ps.setObject(2, instanceId);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    return new DerivedScope(
                        rs.getString("environment"),
                        rs.getString("product_id"),
                        rs.getString("product_version"),
                        rs.getObject("product_version_id", UUID.class),
                        rs.getString("service_id")
                    );
                } else {
                    return null; // Instance not found
                }
            }
        }
    }

    private void insertSnapshotMetadata(Connection conn, UUID snapshotId, String snapshotType, String description,
                                        UUID userId, UUID deploymentId, UUID hostId, DerivedScope scope) throws SQLException {
        String sql = """
            INSERT INTO config_snapshot_t
            (snapshot_id, snapshot_ts, snapshot_type, description, user_id, deployment_id,
             scope_host_id, scope_environment, scope_product_id, scope_product_version_id, -- Changed col name
             scope_service_id /*, scope_api_id, scope_api_version - Add if applicable */)
            VALUES (?, CURRENT_TIMESTAMP, ?, ?, ?, ?, ?, ?, ?, ?, ? /*, ?, ? */)
            """;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, snapshotId);
            ps.setString(2, snapshotType);
            ps.setString(3, description);
            ps.setObject(4, userId);         // setObject handles null correctly
            ps.setObject(5, deploymentId);   // setObject handles null correctly
            ps.setObject(6, hostId);
            ps.setString(7, scope.environment());
            ps.setString(8, scope.productId());
            ps.setObject(9, scope.productVersionId()); // Store the ID
            ps.setString(10, scope.serviceId());
            // Set API scope if needed ps.setObject(11, ...); ps.setString(12, ...);
            ps.executeUpdate();
        }
    }


    private void insertEffectiveConfigSnapshot(Connection conn, UUID snapshotId, UUID hostId, UUID instanceId, DerivedScope scope) throws SQLException {
         String insertSql = """
            INSERT INTO config_snapshot_property_t
            (snapshot_property_id, snapshot_id, config_phase, config_id, property_id, property_name,
             property_type, property_value, value_type, source_level)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """;
        // Prepare the aggregation query
        try (PreparedStatement selectStmt = conn.prepareStatement(AGGREGATE_EFFECTIVE_CONFIG_SQL);
             PreparedStatement insertStmt = conn.prepareStatement(insertSql)) {

             // Set ALL parameters for the AGGREGATE_EFFECTIVE_CONFIG_SQL query
             int paramIndex = 1;
             // Example: set parameters based on how AGGREGATE_EFFECTIVE_CONFIG_SQL is structured
             // selectStmt.setObject(paramIndex++, hostId);
             // selectStmt.setObject(paramIndex++, instanceId);
             // ... set derived scope IDs (productVersionId, environment, productId) ...
             // ... set parameters for all UNION branches and potential subqueries ...

             try (ResultSet rs = selectStmt.executeQuery()) {
                int batchCount = 0;
                while (rs.next()) {
                    insertStmt.setObject(1, UuidCreator.getTimeOrderedEpoch()); // snapshot_property_id
                    insertStmt.setObject(2, snapshotId);
                    insertStmt.setString(3, rs.getString("config_phase"));
                    insertStmt.setObject(4, rs.getObject("config_id", UUID.class));
                    insertStmt.setObject(5, rs.getObject("property_id", UUID.class));
                    insertStmt.setString(6, rs.getString("property_name"));
                    insertStmt.setString(7, rs.getString("property_type"));
                    insertStmt.setString(8, rs.getString("property_value"));
                    insertStmt.setString(9, rs.getString("value_type"));
                    insertStmt.setString(10, mapPriorityToSourceLevel(rs.getInt("priority_level"))); // Map numeric priority back to level name

                    insertStmt.addBatch();
                    batchCount++;

                    if (batchCount % 100 == 0) { // Execute batch periodically
                        insertStmt.executeBatch();
                    }
                }
                 if (batchCount % 100 != 0) { // Execute remaining batch
                     insertStmt.executeBatch();
                 }
             }
         }
    }

    // Helper to map priority back to source level name
    private String mapPriorityToSourceLevel(int priority) {
        return switch (priority) {
            case 10 -> "instance_app_api"; // Adjust priorities as used in your query
            case 20 -> "instance_api";
            case 30 -> "instance_app";
            case 40 -> "instance";
            case 50 -> "product_version";
            case 60 -> "environment";
            case 70 -> "product";
            case 100 -> "default";
            default -> "unknown";
        };
    }


    // --- Methods for Snapshotting Individual Override Tables ---

    private void snapshotInstanceProperties(Connection conn, UUID snapshotId, UUID hostId, UUID instanceId) throws SQLException {
        String sql = """
            INSERT INTO snapshot_instance_property_t
            (snapshot_id, host_id, instance_id, property_id, property_value, update_user, update_ts)
            SELECT ?, host_id, instance_id, property_id, property_value, update_user, update_ts
            FROM instance_property_t
            WHERE host_id = ? AND instance_id = ?
            """;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, snapshotId);
            ps.setObject(2, hostId);
            ps.setObject(3, instanceId);
            ps.executeUpdate();
        }
    }

    private void snapshotInstanceApiProperties(Connection conn, UUID snapshotId, UUID hostId, UUID instanceId) throws SQLException {
         // Find relevant instance_api_ids first
        List<UUID> apiIds = findRelevantInstanceApiIds(conn, hostId, instanceId);
        if (apiIds.isEmpty()) return; // No API overrides for this instance

        String sql = """
            INSERT INTO snapshot_instance_api_property_t
            (snapshot_id, host_id, instance_api_id, property_id, property_value, update_user, update_ts)
            SELECT ?, host_id, instance_api_id, property_id, property_value, update_user, update_ts
            FROM instance_api_property_t
            WHERE host_id = ? AND instance_api_id = ANY(?) -- Use ANY with array for multiple IDs
            """;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, snapshotId);
            ps.setObject(2, hostId);
            // Create a SQL Array from the List of UUIDs
            Array sqlArray = conn.createArrayOf("UUID", apiIds.toArray());
            ps.setArray(3, sqlArray);
            ps.executeUpdate();
            sqlArray.free(); // Release array resources
        }
    }

    // Similar methods for snapshotInstanceAppProperties, snapshotInstanceAppApiProperties...
    // These will need helper methods like findRelevantInstanceApiIds/findRelevantInstanceAppIds

    private void snapshotEnvironmentProperties(Connection conn, UUID snapshotId, UUID hostId, String environment) throws SQLException {
        if (environment == null || environment.isEmpty()) return; // No environment scope
        String sql = """
             INSERT INTO snapshot_environment_property_t
             (snapshot_id, host_id, environment, property_id, property_value, update_user, update_ts)
             SELECT ?, host_id, environment, property_id, property_value, update_user, update_ts
             FROM environment_property_t
             WHERE host_id = ? AND environment = ?
             """;
         try (PreparedStatement ps = conn.prepareStatement(sql)) {
             ps.setObject(1, snapshotId);
             ps.setObject(2, hostId);
             ps.setString(3, environment);
             ps.executeUpdate();
         }
    }

     private void snapshotProductVersionProperties(Connection conn, UUID snapshotId, UUID hostId, UUID productVersionId) throws SQLException {
         if (productVersionId == null) return;
         String sql = """
              INSERT INTO snapshot_product_version_property_t
              (snapshot_id, host_id, product_version_id, property_id, property_value, update_user, update_ts)
              SELECT ?, host_id, product_version_id, property_id, property_value, update_user, update_ts
              FROM product_version_property_t
              WHERE host_id = ? AND product_version_id = ?
              """;
          try (PreparedStatement ps = conn.prepareStatement(sql)) {
              ps.setObject(1, snapshotId);
              ps.setObject(2, hostId);
              ps.setObject(3, productVersionId);
              ps.executeUpdate();
          }
     }

     private void snapshotProductProperties(Connection conn, UUID snapshotId, String productId) throws SQLException {
         if (productId == null || productId.isEmpty()) return;
         String sql = """
               INSERT INTO snapshot_product_property_t
               (snapshot_id, product_id, property_id, property_value, update_user, update_ts)
               SELECT ?, product_id, property_id, property_value, update_user, update_ts
               FROM product_property_t
               WHERE product_id = ?
               """;
           try (PreparedStatement ps = conn.prepareStatement(sql)) {
               ps.setObject(1, snapshotId);
               ps.setString(2, productId);
               ps.executeUpdate();
           }
     }

    // --- Helper method to find associated instance_api_ids ---
    private List<UUID> findRelevantInstanceApiIds(Connection conn, UUID hostId, UUID instanceId) throws SQLException {
        List<UUID> ids = new ArrayList<>();
        String sql = "SELECT instance_api_id FROM instance_api_t WHERE host_id = ? AND instance_id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, hostId);
            ps.setObject(2, instanceId);
            try (ResultSet rs = ps.executeQuery()) {
                while(rs.next()) {
                    ids.add(rs.getObject("instance_api_id", UUID.class));
                }
            }
        }
        return ids;
    }
     // --- Add similar helper for findRelevantInstanceAppIds ---
     // --- Add similar helper for findRelevantInstanceAppApiIds (if needed) ---

}

SQL INSERT Statements:

  1. config_snapshot_t:

    INSERT INTO config_snapshot_t
    (snapshot_id, snapshot_ts, snapshot_type, description, user_id, deployment_id,
     scope_host_id, scope_environment, scope_product_id, scope_product_version_id, scope_service_id /*, ... other scope cols */)
    VALUES (?, CURRENT_TIMESTAMP, ?, ?, ?, ?, ?, ?, ?, ?, ? /*, ... */)
    

    (Parameters: snapshotId, snapshotType, description, userId, deploymentId, hostId, environment, productId, productVersionId, serviceId, …)

  2. config_snapshot_property_t: (Executed in a loop/batch)

    INSERT INTO config_snapshot_property_t
    (snapshot_property_id, snapshot_id, config_phase, config_id, property_id, property_name,
     property_type, property_value, value_type, source_level)
    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    

    (Parameters: new UUID, snapshotId, phase, configId, propertyId, propName, propType, propValue, valType, sourceLevelString)

  3. snapshot_instance_property_t:

    INSERT INTO snapshot_instance_property_t
    (snapshot_id, host_id, instance_id, property_id, property_value, update_user, update_ts)
    SELECT ?, host_id, instance_id, property_id, property_value, update_user, update_ts
    FROM instance_property_t
    WHERE host_id = ? AND instance_id = ?
    

    (Parameters: snapshotId, hostId, instanceId)

  4. snapshot_instance_api_property_t:

    INSERT INTO snapshot_instance_api_property_t
    (snapshot_id, host_id, instance_api_id, property_id, property_value, update_user, update_ts)
    SELECT ?, host_id, instance_api_id, property_id, property_value, update_user, update_ts
    FROM instance_api_property_t
    WHERE host_id = ? AND instance_api_id = ANY(?) -- Parameter is a SQL Array of relevant instance_api_ids
    

    (Parameters: snapshotId, hostId, SQL Array of instance_api_ids)

  5. snapshot_instance_app_property_t:

    INSERT INTO snapshot_instance_app_property_t
    (snapshot_id, host_id, instance_app_id, property_id, property_value, update_user, update_ts)
    SELECT ?, host_id, instance_app_id, property_id, property_value, update_user, update_ts
    FROM instance_app_property_t
    WHERE host_id = ? AND instance_app_id = ANY(?) -- Parameter is a SQL Array of relevant instance_app_ids
    

    (Parameters: snapshotId, hostId, SQL Array of instance_app_ids)

  6. snapshot_instance_app_api_property_t:

    INSERT INTO snapshot_instance_app_api_property_t
    (snapshot_id, host_id, instance_app_id, instance_api_id, property_id, property_value, update_user, update_ts)
    SELECT ?, host_id, instance_app_id, instance_api_id, property_id, property_value, update_user, update_ts
    FROM instance_app_api_property_t
    WHERE host_id = ?
      AND instance_app_id = ANY(?) -- SQL Array of relevant instance_app_ids
      AND instance_api_id = ANY(?) -- SQL Array of relevant instance_api_ids
    

    (Parameters: snapshotId, hostId, SQL Array of instance_app_ids, SQL Array of instance_api_ids)

  7. snapshot_environment_property_t:

    INSERT INTO snapshot_environment_property_t
    (snapshot_id, host_id, environment, property_id, property_value, update_user, update_ts)
    SELECT ?, host_id, environment, property_id, property_value, update_user, update_ts
    FROM environment_property_t
    WHERE host_id = ? AND environment = ?
    

    (Parameters: snapshotId, hostId, environment)

  8. snapshot_product_version_property_t:

    INSERT INTO snapshot_product_version_property_t
    (snapshot_id, host_id, product_version_id, property_id, property_value, update_user, update_ts)
    SELECT ?, host_id, product_version_id, property_id, property_value, update_user, update_ts
    FROM product_version_property_t
    WHERE host_id = ? AND product_version_id = ?
    

    (Parameters: snapshotId, hostId, productVersionId)

  9. snapshot_product_property_t:

    INSERT INTO snapshot_product_property_t
    (snapshot_id, product_id, property_id, property_value, update_user, update_ts)
    SELECT ?, product_id, property_id, property_value, update_user, update_ts
    FROM product_property_t
    WHERE product_id = ?
    

    (Parameters: snapshotId, productId)

Important Considerations:

  • Transaction Management: Ensure all these inserts happen within a single atomic transaction (see the sketch after this list).
  • Error Handling: Robust error handling and rollback are critical.
  • Performance: The aggregation query can be heavy. Consider its performance impact. The INSERT ... SELECT statements should be reasonably efficient if the source tables are properly indexed on their scope columns.
  • Parameterization: Use PreparedStatement for all queries to prevent SQL injection and improve performance.
  • Scope Derivation Logic: Ensure the logic in deriveScopeInfo and the helpers like findRelevantInstanceApiIds is correct and handles cases where relationships might not exist (e.g., an instance with no associated APIs).
  • UUID Generation: Use a suitable UUID generator (like UUIDv7 via uuid-creator).
  • Logging: Add clear logging for snapshot creation success/failure.
  • Refinement: You’ll need to adapt the AGGREGATE_EFFECTIVE_CONFIG_SQL query based on the final version using ROW_NUMBER and correctly incorporate parameters.
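
To make the transaction-management and parameterization points concrete, here is a minimal JDBC skeleton for the snapshot creation flow. The helper method names (insertSnapshotRecord and so on) are hypothetical stand-ins for the numbered INSERT statements above, not existing methods:

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.UUID;

// Sketch only: the insert*/copy* helpers are placeholders for the statements listed above.
public void createSnapshotAtomically(DataSource ds, UUID snapshotId) throws SQLException {
    try (Connection conn = ds.getConnection()) {
        conn.setAutoCommit(false);                       // one transaction for every insert
        try {
            insertSnapshotRecord(conn, snapshotId);      // 1. config_snapshot_t
            insertSnapshotProperties(conn, snapshotId);  // 2. config_snapshot_property_t (batched)
            copyInstanceOverrides(conn, snapshotId);     // 3-6. instance/API/app/app-API overrides
            copyScopeOverrides(conn, snapshotId);        // 7-9. environment/product version/product
            conn.commit();                               // snapshot becomes visible atomically
        } catch (SQLException e) {
            conn.rollback();                             // discard the partially written snapshot
            throw e;
        }
    }
}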

rollbackConfigInstance

Okay, here’s the rollbackConfigInstance method implementing the DELETE/INSERT strategy to restore the state of instance-level and related sub-level overrides from a snapshot.

Assumptions:

  • “Rolling back an instance” means restoring the overrides defined specifically for that instance and its associated APIs, Apps, and App-API combinations. It does not modify higher-level overrides (Environment, Product Version, Product).
  • The snapshot_*_property_t tables accurately store the state of the corresponding live tables at the time the snapshot was taken.
  • The necessary helper methods like findRelevantInstanceApiIds, findRelevantInstanceAppIds exist (examples provided).
import com.github.f4b6a3.uuid.UuidCreator; // If needed for audit logging ID
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.sql.DataSource;
import java.sql.*;
import java.util.*;

public class ConfigRollbackService {

    private static final Logger logger = LoggerFactory.getLogger(ConfigRollbackService.class);

    private final DataSource ds;

    public ConfigRollbackService(DataSource ds) {
        this.ds = ds; // Inject DataSource via constructor
    }

    // --- SQL Templates ---

    // DELETE Statements (Targeting LIVE tables)
    private static final String DELETE_INSTANCE_PROPS_SQL = "DELETE FROM instance_property_t WHERE host_id = ? AND instance_id = ?";
    private static final String DELETE_INSTANCE_API_PROPS_SQL = "DELETE FROM instance_api_property_t WHERE host_id = ? AND instance_api_id = ANY(?)";
    private static final String DELETE_INSTANCE_APP_PROPS_SQL = "DELETE FROM instance_app_property_t WHERE host_id = ? AND instance_app_id = ANY(?)";
    private static final String DELETE_INSTANCE_APP_API_PROPS_SQL = "DELETE FROM instance_app_api_property_t WHERE host_id = ? AND instance_app_id = ANY(?) AND instance_api_id = ANY(?)";

    // INSERT ... SELECT Statements (From SNAPSHOT tables to LIVE tables)
    private static final String INSERT_INSTANCE_PROPS_SQL = """
        INSERT INTO instance_property_t
        (host_id, instance_id, property_id, property_value, update_user, update_ts)
        SELECT host_id, instance_id, property_id, property_value, update_user, update_ts
        FROM snapshot_instance_property_t
        WHERE snapshot_id = ? AND host_id = ? AND instance_id = ?
        """;
    private static final String INSERT_INSTANCE_API_PROPS_SQL = """
        INSERT INTO instance_api_property_t
        (host_id, instance_api_id, property_id, property_value, update_user, update_ts)
        SELECT host_id, instance_api_id, property_id, property_value, update_user, update_ts
        FROM snapshot_instance_api_property_t
        WHERE snapshot_id = ? AND host_id = ? AND instance_api_id = ANY(?)
        """;
     private static final String INSERT_INSTANCE_APP_PROPS_SQL = """
        INSERT INTO instance_app_property_t
        (host_id, instance_app_id, property_id, property_value, update_user, update_ts)
        SELECT host_id, instance_app_id, property_id, property_value, update_user, update_ts
        FROM snapshot_instance_app_property_t
        WHERE snapshot_id = ? AND host_id = ? AND instance_app_id = ANY(?)
        """;
    private static final String INSERT_INSTANCE_APP_API_PROPS_SQL = """
        INSERT INTO instance_app_api_property_t
        (host_id, instance_app_id, instance_api_id, property_id, property_value, update_user, update_ts)
        SELECT host_id, instance_app_id, instance_api_id, property_id, property_value, update_user, update_ts
        FROM snapshot_instance_app_api_property_t
        WHERE snapshot_id = ? AND host_id = ? AND instance_app_id = ANY(?) AND instance_api_id = ANY(?)
        """;

    public Result<String> rollbackConfigInstance(Map<String, Object> event) {
        // 1. Extract Input Parameters
        UUID snapshotId = (UUID) event.get("snapshotId");
        UUID hostId = (UUID) event.get("hostId");
        UUID instanceId = (UUID) event.get("instanceId");
        UUID userId = (UUID) event.get("userId"); // For potential auditing
        String description = (String) event.get("rollbackDescription"); // Optional reason

        if (snapshotId == null || hostId == null || instanceId == null) {
            return Failure.of(new Status(INVALID_PARAMETER, "snapshotId, hostId, and instanceId are required."));
        }

        Connection connection = null;
        List<UUID> currentApiIds = null;
        List<UUID> currentAppIds = null;

        try {
            connection = ds.getConnection();
            connection.setAutoCommit(false); // Start Transaction

            // --- Pre-computation: Find CURRENT associated IDs for DELETE scope ---
            // It's generally safer to delete based on current relationships and then
            // insert based on snapshot relationships if they could have diverged.
            currentApiIds = findRelevantInstanceApiIds(connection, hostId, instanceId);
            currentAppIds = findRelevantInstanceAppIds(connection, hostId, instanceId);
            // Note: InstanceAppApi requires both lists.

            logger.info("Starting rollback for instance {} (host {}) to snapshot {}", instanceId, hostId, snapshotId);

            // --- Execute Deletes from LIVE tables ---
            executeDelete(connection, DELETE_INSTANCE_PROPS_SQL, hostId, instanceId);

            if (!currentApiIds.isEmpty()) {
                executeDeleteWithArray(connection, DELETE_INSTANCE_API_PROPS_SQL, hostId, currentApiIds);
                // Also delete AppApi props related to these APIs if apps also exist
                if (!currentAppIds.isEmpty()) {
                     executeDeleteWithTwoArrays(connection, DELETE_INSTANCE_APP_API_PROPS_SQL, hostId, currentAppIds, currentApiIds);
                }
            }

            if (!currentAppIds.isEmpty()) {
                executeDeleteWithArray(connection, DELETE_INSTANCE_APP_PROPS_SQL, hostId, currentAppIds);
                 // AppApi props deletion might have already happened above if APIs existed.
                 // If only apps existed but no APIs, delete AppApi here (redundant if handled above)
                 // Generally safe to run the AppApi delete again if needed, targeting only appIds.
                 // For simplicity, we assume the AppApi delete targeting both arrays covers necessary cases.
            }


            // --- Execute Inserts from SNAPSHOT tables ---
            executeInsertSelect(connection, INSERT_INSTANCE_PROPS_SQL, snapshotId, hostId, instanceId);

            // For array-based inserts, we need the IDs *from the snapshot time*
            // However, the SELECT inside the INSERT query implicitly filters by snapshot_id AND the array condition,
            // so it should correctly only insert relationships that existed in the snapshot.
            // We still use the *current* IDs to DEFINE the overall scope of instance being affected,
            // but the INSERT...SELECT filters correctly based on snapshot content.
            if (!currentApiIds.isEmpty()) { // Use currentApiIds to decide IF we run the insert query
                executeInsertSelectWithArray(connection, INSERT_INSTANCE_API_PROPS_SQL, snapshotId, hostId, currentApiIds);
                 if (!currentAppIds.isEmpty()) {
                    executeInsertSelectWithTwoArrays(connection, INSERT_INSTANCE_APP_API_PROPS_SQL, snapshotId, hostId, currentAppIds, currentApiIds);
                }
            }
            if (!currentAppIds.isEmpty()) { // Use currentAppIds to decide IF we run the insert query
                executeInsertSelectWithArray(connection, INSERT_INSTANCE_APP_PROPS_SQL, snapshotId, hostId, currentAppIds);
                // Redundant AppApi insert if handled above? No, the INSERT uses the AppId filter.
                // If only apps existed at snapshot time, this covers it.
            }

            // --- Optional: Audit Logging ---
            // logRollbackActivity(connection, snapshotId, hostId, instanceId, userId, description);


            // --- Commit Transaction ---
            connection.commit();
            logger.info("Successfully rolled back instance {} (host {}) to snapshot {}", instanceId, hostId, snapshotId);
            return Success.of("Rollback successful to snapshot " + snapshotId);

        } catch (SQLException e) {
            logger.error("SQLException during rollback for instance {} to snapshot {}: {}", instanceId, snapshotId, e.getMessage(), e);
            if (connection != null) {
                try {
                    connection.rollback();
                     logger.warn("Transaction rolled back for instance {} snapshot {}", instanceId, snapshotId);
                } catch (SQLException ex) {
                    logger.error("Error rolling back transaction:", ex);
                }
            }
            return Failure.of(new Status(SQL_EXCEPTION, "Database error during rollback operation."));
        } catch (Exception e) { // Catch other potential errors
             logger.error("Exception during rollback for instance {} to snapshot {}: {}", instanceId, snapshotId, e.getMessage(), e);
             if (connection != null) {
                 try { connection.rollback(); } catch (SQLException ex) { logger.error("Error rolling back transaction:", ex); }
             }
            return Failure.of(new Status(GENERIC_EXCEPTION, "Unexpected error during rollback operation."));
        } finally {
            if (connection != null) {
                try {
                    connection.setAutoCommit(true); // Restore default behavior
                    connection.close();
                } catch (SQLException e) {
                    logger.error("Error closing connection:", e);
                }
            }
        }
    }

    // --- Helper Methods for Execution ---

    private void executeDelete(Connection conn, String sql, UUID hostId, UUID instanceId) throws SQLException {
         try (PreparedStatement ps = conn.prepareStatement(sql)) {
             ps.setObject(1, hostId);
             ps.setObject(2, instanceId);
             int rowsAffected = ps.executeUpdate();
             logger.debug("Deleted {} rows from {} for instance {}", rowsAffected, getTableNameFromDeleteSql(sql), instanceId);
         }
    }

    private void executeDeleteWithArray(Connection conn, String sql, UUID hostId, List<UUID> idList) throws SQLException {
        if (idList == null || idList.isEmpty()) return; // Nothing to delete if list is empty
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, hostId);
            Array sqlArray = conn.createArrayOf("UUID", idList.toArray());
            ps.setArray(2, sqlArray);
            int rowsAffected = ps.executeUpdate();
            logger.debug("Deleted {} rows from {} for {} IDs", rowsAffected, getTableNameFromDeleteSql(sql), idList.size());
            sqlArray.free();
        }
    }

    private void executeDeleteWithTwoArrays(Connection conn, String sql, UUID hostId, List<UUID> idList1, List<UUID> idList2) throws SQLException {
        if (idList1 == null || idList1.isEmpty() || idList2 == null || idList2.isEmpty()) return;
         try (PreparedStatement ps = conn.prepareStatement(sql)) {
             ps.setObject(1, hostId);
             Array sqlArray1 = conn.createArrayOf("UUID", idList1.toArray());
             Array sqlArray2 = conn.createArrayOf("UUID", idList2.toArray());
             ps.setArray(2, sqlArray1);
             ps.setArray(3, sqlArray2);
             int rowsAffected = ps.executeUpdate();
             logger.debug("Deleted {} rows from {} for {}x{} IDs", rowsAffected, getTableNameFromDeleteSql(sql), idList1.size(), idList2.size());
             sqlArray1.free();
             sqlArray2.free();
         }
    }


    private void executeInsertSelect(Connection conn, String sql, UUID snapshotId, UUID hostId, UUID instanceId) throws SQLException {
         try (PreparedStatement ps = conn.prepareStatement(sql)) {
             ps.setObject(1, snapshotId);
             ps.setObject(2, hostId);
             ps.setObject(3, instanceId);
             int rowsAffected = ps.executeUpdate();
              logger.debug("Inserted {} rows into {} from snapshot {}", rowsAffected, getTableNameFromInsertSql(sql), snapshotId);
         }
    }

     private void executeInsertSelectWithArray(Connection conn, String sql, UUID snapshotId, UUID hostId, List<UUID> idList) throws SQLException {
         if (idList == null || idList.isEmpty()) return; // No scope to insert for
         try (PreparedStatement ps = conn.prepareStatement(sql)) {
             ps.setObject(1, snapshotId);
             ps.setObject(2, hostId);
             Array sqlArray = conn.createArrayOf("UUID", idList.toArray());
             ps.setArray(3, sqlArray);
             int rowsAffected = ps.executeUpdate();
             logger.debug("Inserted {} rows into {} from snapshot {} for {} IDs", rowsAffected, getTableNameFromInsertSql(sql), snapshotId, idList.size());
             sqlArray.free();
         }
     }

     private void executeInsertSelectWithTwoArrays(Connection conn, String sql, UUID snapshotId, UUID hostId, List<UUID> idList1, List<UUID> idList2) throws SQLException {
         if (idList1 == null || idList1.isEmpty() || idList2 == null || idList2.isEmpty()) return;
         try (PreparedStatement ps = conn.prepareStatement(sql)) {
             ps.setObject(1, snapshotId);
             ps.setObject(2, hostId);
             Array sqlArray1 = conn.createArrayOf("UUID", idList1.toArray());
             Array sqlArray2 = conn.createArrayOf("UUID", idList2.toArray());
             ps.setArray(3, sqlArray1);
             ps.setArray(4, sqlArray2);
             int rowsAffected = ps.executeUpdate();
              logger.debug("Inserted {} rows into {} from snapshot {} for {}x{} IDs", rowsAffected, getTableNameFromInsertSql(sql), snapshotId, idList1.size(), idList2.size());
             sqlArray1.free();
             sqlArray2.free();
         }
     }


    // --- Helper methods to find associated IDs (same as before) ---
    private List<UUID> findRelevantInstanceApiIds(Connection conn, UUID hostId, UUID instanceId) throws SQLException {
        // ... implementation ...
    }
    private List<UUID> findRelevantInstanceAppIds(Connection conn, UUID hostId, UUID instanceId) throws SQLException {
        // ... implementation ...
    }

    // --- Optional: Helper to get table name from SQL for logging ---
    private String getTableNameFromDeleteSql(String sql) {
        // Simple parsing, might need adjustment
        try { return sql.split("FROM ")[1].split(" ")[0]; } catch (Exception e) { return "[unknown table]"; }
    }
    private String getTableNameFromInsertSql(String sql) {
        try { return sql.split("INTO ")[1].split(" ")[0]; } catch (Exception e) { return "[unknown table]"; }
    }

     // --- Optional: Audit Logging Method ---
    // private void logRollbackActivity(Connection conn, UUID snapshotId, ...) throws SQLException { ... }

}

Explanation:

  1. Parameter Extraction: Gets snapshotId, hostId, instanceId.
  2. Transaction Control: Turns off auto-commit with setAutoCommit(false) at the start and finishes with commit() or rollback().
  3. Find Current Related IDs: Queries instance_api_t and instance_app_t to find the currently associated APIs and Apps for the target instance. This is needed to define the scope for the DELETE operations.
  4. Execute Deletes: Uses helper methods to execute DELETE statements against the live override tables (instance_property_t, instance_api_property_t, etc.), filtering by hostId, instanceId, and the arrays of currentApiIds / currentAppIds where applicable. This clears the current overrides within the specific scope.
  5. Execute Inserts: Uses helper methods to execute INSERT INTO [live_table] SELECT ... FROM [snapshot_table] statements.
    • These statements select data from the snapshot_*_property_t tables.
    • They are filtered by the target snapshotId.
    • Crucially, they are also filtered by hostId, instanceId, and the relevant instance_api_id / instance_app_id arrays. This ensures only the overrides relevant to this instance within that snapshot are inserted.
  6. SQL Arrays: Uses connection.createArrayOf("UUID", ...) to pass lists of IDs efficiently to PostgreSQL’s ANY(?) operator. Remember to free() the array resources.
  7. Helpers: Includes helper methods for executing the different types of DELETE/INSERT statements and for finding the related API/App IDs.
  8. Error Handling: Uses try-catch-finally to manage the transaction and connection state.
  9. Logging: Added basic logging for start, success, errors, and rows affected.
  10. Scope Focus: This implementation focuses the rollback on the instance and its direct sub-levels (API, App, App-API). It assumes rolling back an instance doesn’t automatically mean rolling back shared Environment or Product Version overrides.

This approach provides a clean and relatively robust way to restore the specific override state from your snapshots using the DELETE/INSERT strategy. Remember to test thoroughly!
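
For reference, a minimal calling sketch is shown below. The event map keys match the parameter extraction in the method above; the surrounding wiring (how the service and IDs are obtained) is illustrative only:

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Build the event map with the keys rollbackConfigInstance reads, then invoke the service.
// Result<String> is the same framework type used by the service method above.
public Result<String> rollbackInstance(ConfigRollbackService service, UUID snapshotId,
                                       UUID hostId, UUID instanceId, UUID userId) {
    Map<String, Object> event = new HashMap<>();
    event.put("snapshotId", snapshotId);                            // snapshot to restore from
    event.put("hostId", hostId);                                    // tenant host
    event.put("instanceId", instanceId);                            // instance whose overrides are rolled back
    event.put("userId", userId);                                    // optional, used for auditing
    event.put("rollbackDescription", "Restore instance overrides"); // optional reason
    return service.rollbackConfigInstance(event);                   // Success message or Failure Status
}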

Deployment

The deployment service allows users to deploy and manage their configured light products. It is used by application and API developers as well as operations teams.

The deployment service contains pipeline management, platform management and deployment management. It also integrates with product management and instance management services.

Timestamp

Okay, let’s break down the best way to persist Java’s OffsetDateTime in PostgreSQL.

1. Best Database Column Type: TIMESTAMP WITH TIME ZONE (or TIMESTAMPTZ)

This is unequivocally the best choice in PostgreSQL for storing OffsetDateTime objects. Here’s why:

  • Preserves the Instant: OffsetDateTime represents a specific instant in time with an offset from UTC. TIMESTAMPTZ is designed precisely for this.
  • UTC Normalization: When you insert a value into a TIMESTAMPTZ column, PostgreSQL uses the provided offset to normalize the timestamp and stores it internally as UTC. This is crucial for correctly representing the absolute point in time, regardless of the original offset.
  • Automatic Conversion on Retrieval: When you select data from a TIMESTAMPTZ column, PostgreSQL automatically converts the stored UTC value back to the current session’s timezone setting (TimeZone parameter). Your JDBC driver then maps this appropriately.
  • Avoids Ambiguity: Using TIMESTAMPTZ prevents the ambiguity that can arise with TIMESTAMP WITHOUT TIME ZONE, where the lack of offset/timezone information can lead to incorrect interpretations depending on server and client settings.

Why NOT TIMESTAMP WITHOUT TIME ZONE (or TIMESTAMP)?

  • This type stores the date and time literally as provided, discarding any timezone or offset information.
  • If you store an OffsetDateTime’s local date/time part into this column, you lose the offset, making it impossible to know the exact instant it represents globally. This is generally incorrect for OffsetDateTime.

2. How to Convert (JDBC)

Modern JDBC drivers (PostgreSQL JDBC driver versions supporting JDBC 4.2+, which is most versions used today) handle the conversion automatically and correctly when you use the appropriate methods.

Persisting (Saving):

  • Use PreparedStatement.setObject(int parameterIndex, OffsetDateTime value)
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Assume 'conn' is your established JDBC Connection
public void saveEventTime(Connection conn, int eventId, OffsetDateTime eventTime) throws SQLException {
    // Use TIMESTAMPTZ in your table definition
    String sql = "UPDATE events SET event_timestamp = ? WHERE id = ?";

    try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
        // Use setObject for OffsetDateTime - the driver handles conversion
        pstmt.setObject(1, eventTime); // Pass the OffsetDateTime directly
        pstmt.setInt(2, eventId);
        pstmt.executeUpdate();
    }
}

// Example Usage:
// OffsetDateTime nowWithOffset = OffsetDateTime.now(); // Uses system default offset
// OffsetDateTime specificTime = OffsetDateTime.of(2023, 10, 27, 10, 30, 0, 0, ZoneOffset.ofHours(-4));
// saveEventTime(connection, 1, specificTime);

The JDBC driver sends the OffsetDateTime (including its offset) to PostgreSQL. PostgreSQL’s TIMESTAMPTZ type normalizes this to UTC for storage.

Retrieving (Loading):

  • Use ResultSet.getObject(int columnIndex, OffsetDateTime.class) or ResultSet.getObject(String columnLabel, OffsetDateTime.class)
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.OffsetDateTime;

public OffsetDateTime loadEventTime(Connection conn, int eventId) throws SQLException {
    String sql = "SELECT event_timestamp FROM events WHERE id = ?";
    OffsetDateTime eventTime = null;

    try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
        pstmt.setInt(1, eventId);
        try (ResultSet rs = pstmt.executeQuery()) {
            if (rs.next()) {
                // Use getObject with the target class - the driver handles conversion
                eventTime = rs.getObject("event_timestamp", OffsetDateTime.class);
            }
        }
    }
    return eventTime;
}

// Example Usage:
// OffsetDateTime retrievedTime = loadEventTime(connection, 1);
// if (retrievedTime != null) {
//     System.out.println("Retrieved: " + retrievedTime);
//     // Note: The offset might be different from the original if your
//     // JVM's default timezone or JDBC connection timezone differs
//     // from the original offset, but it represents the SAME instant in time.
// }

When retrieving, PostgreSQL sends the stored UTC timestamp. The JDBC driver, knowing the target type is OffsetDateTime and considering the session’s timezone setting, converts the UTC value back into an appropriate OffsetDateTime object representing the correct instant.

Summary:

  1. Database Column: Use TIMESTAMP WITH TIME ZONE (TIMESTAMPTZ).
  2. Persisting (Java -> DB): Use PreparedStatement.setObject(index, yourOffsetDateTime).
  3. Retrieving (DB -> Java): Use ResultSet.getObject(column, OffsetDateTime.class).
  4. JDBC Driver: Ensure you are using a modern PostgreSQL JDBC driver that supports JDBC 4.2 / Java 8 Time API.
  5. Session Timezone: Be aware that the OffsetDateTime retrieved might have an offset corresponding to the client/session’s timezone setting, but it will represent the same exact instant as the one stored (because it was normalized to UTC).
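
If you need a stable textual representation regardless of the session timezone (point 5 above), you can normalize the retrieved value to UTC in Java. A small sketch, reusing the loadEventTime example above:

import java.sql.Connection;
import java.sql.SQLException;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Same instant, but always expressed with a +00:00 (Z) offset for display or logging.
public String loadEventTimeAsUtcString(Connection conn, int eventId) throws SQLException {
    OffsetDateTime retrieved = loadEventTime(conn, eventId);                // method from the example above
    if (retrieved == null) return null;                                     // no such event
    OffsetDateTime asUtc = retrieved.withOffsetSameInstant(ZoneOffset.UTC); // same instant, UTC offset
    return asUtc.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);            // ISO-8601 string in UTC
}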

Tag

Let’s design a tagging system for your light-portal entities. Tags are typically non-hierarchical keywords or labels that you can assign to entities for flexible organization and discovery, complementing categories.

1. Database Design (PostgreSQL)

For a flexible and efficient tagging system, we’ll use two main tables: a central tags table and a join table entity_tags to create a many-to-many relationship between entities and tags.

a) tags_t Table: Stores the definitions of the tags themselves.

CREATE TABLE tags_t (
    tag_id        VARCHAR(22) NOT NULL,         -- Unique ID for the tag
    host_id       VARCHAR(22),                  -- null means global tag 
    tag_name      VARCHAR(100) UNIQUE NOT NULL, -- Tag name (e.g., "featured", "urgent", "api", "documentation") - Enforce uniqueness
    tag_desc      VARCHAR(1024),                -- Optional description of the tag
    update_user   VARCHAR(255) DEFAULT SESSION_USER NOT NULL,
    update_ts     TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY (tag_id)
);

-- Index for efficient lookup by tag_name (common search/filter)
CREATE INDEX idx_tags_tag_name ON tags_t (tag_name);
  • tag_id: Unique identifier for each tag.
  • tag_name: The actual tag value (e.g., “featured”). UNIQUE NOT NULL constraint ensures tag names are unique across the system (global tags in this design).
  • tag_desc: Optional description for the tag.
  • update_user, update_ts: Standard audit columns.
  • UNIQUE (tag_name): Important constraint to ensure tag names are unique. This makes tag management simpler and consistent.

b) entity_tags_t Join Table (Many-to-Many Relationship): Links entities to tags.

CREATE TABLE entity_tags_t (
    entity_id   VARCHAR(22) NOT NULL,      -- ID of the entity (schema, product, document, etc.)
    entity_type VARCHAR(50) NOT NULL,     -- Type of the entity ('schema', 'product', 'document', etc.)
    tag_id      VARCHAR(22) NOT NULL REFERENCES tags_t(tag_id) ON DELETE CASCADE, -- Foreign key to tags_t

    PRIMARY KEY (entity_id, entity_type, tag_id) -- Composite primary key to prevent duplicate tag assignments to the same entity
);

-- Indexes for efficient queries
CREATE INDEX idx_entity_tags_tag_id ON entity_tags_t (tag_id);        -- Find entities by tag
CREATE INDEX idx_entity_tags_entity ON entity_tags_t (entity_id, entity_type); -- Find tags for an entity
  • entity_id: ID of the entity being tagged.
  • entity_type: Type of the entity (must match the types you use for categories and other entity-related tables).
  • tag_id: Foreign key referencing the tags_t table.
  • Composite Primary Key (entity_id, entity_type, tag_id): Ensures that an entity of a specific type cannot be associated with the same tag multiple times.
  • ON DELETE CASCADE: If a tag is deleted from tags_t, all associations in entity_tags_t are automatically removed. Consider ON DELETE RESTRICT if you want to prevent tag deletion if it’s still in use.

2. Service Endpoints

You’ll need service endpoints to manage tags themselves and to manage the associations between tags and entities.

a) Tag Management Endpoints (Likely in a TagService or Admin-Specific Service):

  • POST /tags - Create a new tag (see the sketch after this list)
    • Request Body (JSON):
      {
        "tagId": "uniqueTagId123",  // Optional - let backend generate if not provided
        "tagName": "featured",      // Required - unique tag name
        "tagDesc": "Items that are highlighted or promoted" // Optional
      }
      
    • Response: 201 Created, with Location header (URL of the new tag) and response body (created tag JSON).
  • GET /tags - List all tags (with pagination, filtering, sorting - similar to getCategory endpoint)
    • Query Parameters: offset, limit, tagName, tagDesc, etc.
    • Response: 200 OK, JSON array of tag objects (with total count).
  • GET /tags/{tagId} - Get a specific tag by ID
    • Path Parameter: tagId
    • Response: 200 OK, tag object in JSON. 404 Not Found if not exists.
  • PUT /tags/{tagId} - Update an existing tag
    • Path Parameter: tagId
    • Request Body (JSON): (Same structure as POST, but tagId in the path is used for identification)
    • Response: 200 OK, updated tag object in JSON. 404 Not Found if tag not found.
  • DELETE /tags/{tagId} - Delete a tag
    • Path Parameter: tagId
    • Response: 204 No Content. 404 Not Found if tag not found.
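
A minimal JDBC sketch of the insert behind POST /tags: the table and columns come from the DDL above, while the method shape and generated-ID handling are illustrative assumptions:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of the POST /tags insert; validation and error handling omitted for brevity.
public void createTag(Connection conn, String tagId, String hostId,
                      String tagName, String tagDesc) throws SQLException {
    String sql = "INSERT INTO tags_t (tag_id, host_id, tag_name, tag_desc) VALUES (?, ?, ?, ?)";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setString(1, tagId);    // generate one (e.g. a Base64-encoded UUID) if the client did not supply it
        ps.setString(2, hostId);   // null means a global tag
        ps.setString(3, tagName);  // the UNIQUE constraint rejects duplicate names
        ps.setString(4, tagDesc);
        ps.executeUpdate();        // update_user/update_ts fall back to their column defaults
    }
}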

b) Entity Tag Association Endpoints (Likely within Entity-Specific Services like SchemaService, ProductService):

  • (Within POST /schemas, PUT /schemas/{schemaId}, etc. entity creation/update endpoints):
    • Request Body for creating or updating an entity should include a field (e.g., tagIds: ["tagId1", "tagId2"]) to specify the tags to associate with the entity.
    • Service logic (like in the updated createSchema and updateSchema methods) will handle updating the entity_tags_t table (deleting old links and inserting new ones) within the same transaction as the entity creation/update. A sketch of this delete-then-insert logic follows this list.
  • GET /schemas/{schemaId}/tags (or /products/{productId}/tags, etc.) - Get tags associated with a specific entity
    • Path Parameter: schemaId (or productId, etc.)
    • Response: 200 OK, JSON array of tag objects associated with the entity.
  • PUT /schemas/{schemaId}/tags (or similar) - Replace tags associated with an entity (Less common, often handled within the entity update endpoint directly)
    • Path Parameter: schemaId
    • Request Body (JSON): { "tagIds": ["tagIdA", "tagIdB"] } - list of tag IDs to associate.
    • Response: 200 OK, updated entity object (or just 204 No Content).
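
Here is a sketch of the delete-then-insert logic referenced above. The method name and the way the Connection is shared with the entity transaction are illustrative; the table and columns come from the DDL above:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Replace all tag links for one entity; call this with the same Connection (and transaction)
// used to create or update the entity itself.
public void replaceEntityTags(Connection conn, String entityId, String entityType,
                              List<String> tagIds) throws SQLException {
    try (PreparedStatement del = conn.prepareStatement(
            "DELETE FROM entity_tags_t WHERE entity_id = ? AND entity_type = ?")) {
        del.setString(1, entityId);
        del.setString(2, entityType);
        del.executeUpdate();                      // drop the old links
    }
    try (PreparedStatement ins = conn.prepareStatement(
            "INSERT INTO entity_tags_t (entity_id, entity_type, tag_id) VALUES (?, ?, ?)")) {
        for (String tagId : tagIds) {
            ins.setString(1, entityId);
            ins.setString(2, entityType);
            ins.setString(3, tagId);
            ins.addBatch();                       // insert the new links in one batch
        }
        ins.executeBatch();
    }
}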

c) Entity Filtering/Search Endpoints:

  • GET /schemas (or /products, /documents, etc.) - List entities, now with tag filtering:
    • Query Parameter: tagNames (or tagIds, or tags - choose one and be consistent), e.g., tagNames=featured,api&tagNames=urgent (multiple tags to filter by).
    • Backend logic: Modify the getSchema (or getProduct, getDocument, etc.) service methods to:
      1. Parse the tagNames parameter (could be comma-separated, multiple parameters, etc.).
      2. Modify the SQL query to include a JOIN with entity_tags_t and tags_t and add a WHERE clause to filter by the provided tag names. You might need to use EXISTS or IN subqueries for efficient filtering by multiple tags.

Example Query for Filtering Schemas by Tags (using PostgreSQL EXISTS):

SELECT schema_t.*, ... -- Select schema columns
FROM schema_t
WHERE EXISTS (
    SELECT 1
    FROM entity_tags_t et
    INNER JOIN tags_t t ON et.tag_id = t.tag_id
    WHERE et.entity_id = schema_t.schema_id
      AND et.entity_type = 'schema'
      AND t.tag_name IN (?, ?, ?) -- Parameterized tag names list
);
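
On the Java side, the IN (...) list needs one placeholder per tag name. A minimal sketch of building and binding it, following the query shape above (treating schema_id as a string key is an assumption):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Returns the schema_ids tagged with any of the given tag names; assumes tagNames is non-empty.
public List<String> findSchemaIdsByTags(Connection conn, List<String> tagNames) throws SQLException {
    String placeholders = String.join(", ", Collections.nCopies(tagNames.size(), "?"));
    String sql = "SELECT s.schema_id FROM schema_t s WHERE EXISTS ("
               + "SELECT 1 FROM entity_tags_t et "
               + "INNER JOIN tags_t t ON et.tag_id = t.tag_id "
               + "WHERE et.entity_id = s.schema_id AND et.entity_type = 'schema' "
               + "AND t.tag_name IN (" + placeholders + "))";
    List<String> ids = new ArrayList<>();
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        for (int i = 0; i < tagNames.size(); i++) {
            ps.setString(i + 1, tagNames.get(i));   // JDBC parameters are 1-based
        }
        try (ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                ids.add(rs.getString("schema_id"));
            }
        }
    }
    return ids;
}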

UI Considerations:

  • Tag Management UI: Similar to category management, likely an admin section to create, edit, delete tags.
  • Tag Assignment UI:
    • Entity creation/edit forms should include a tag selection component (e.g., tag input with autocomplete, checkboxes, tag pills).
    • Allow users to search/browse existing tags and assign them.
  • Tag Filtering/Browsing UI:
    • Display tags prominently (tag cloud, list, filters).
    • Clicking/selecting a tag should filter the entity lists to show only entities associated with that tag.

Benefits of this Tagging System:

  • Flexible Organization: Tags are free-form and non-hierarchical, allowing for more flexible and ad-hoc categorization than categories alone.
  • Discoverability: Improves search and filtering capabilities, making it easier for users to find relevant entities.
  • Metadata Enrichment: Tags add valuable metadata to entities.
  • Scalability: The database design is efficient for querying and managing tags and associations even with a large number of entities and tags.

This design provides a solid foundation for a tagging system. You can further refine it based on your specific requirements, such as adding tag groups, permissions for tag management, or more advanced search capabilities.

UUID

In the light-portal database, we are using UUID for most of the keys in order to support event replay between multiple environments. To balance database performance with the need for URL-friendly identifiers, we use the PostgreSQL native UUID type for the keys.

CREATE TABLE your_table (
    id UUID PRIMARY KEY,
    -- other columns
);

PostgreSQL can only generate UUIDv4 natively, which causes index locality problems. So we use Java to generate UUIDv7, a time-ordered UUID. These embed a timestamp, making them roughly sequential and significantly improving index locality and insert performance. You’ll need a library for this.

import com.github.f4b6a3.uuid.UuidCreator;
import java.util.UUID;

// In your entity or service
UUID primaryKey = UuidCreator.getTimeOrderedEpoch(); // UUIDv7
// Store this 'primaryKey' directly.

In the light-4j utility module, we have a UuidUtil class that can generate UUIDv7 values and encode/decode them to and from URL-safe Base64 strings.

Here is the class.

package com.networknt.utility;

import com.github.f4b6a3.uuid.UuidCreator;
import java.util.Base64;
import java.util.UUID;
import java.nio.ByteBuffer;

public class UuidUtil {

    // Use Java 8's built-in Base64 encoder/decoder
    private static final Base64.Encoder URL_SAFE_ENCODER = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder URL_SAFE_DECODER = Base64.getUrlDecoder();

    public static UUID getUUID() {
        return UuidCreator.getTimeOrderedEpoch(); // UUIDv7
    }

    /**
     * Encode a UUID to a URL-safe Base64 string.
     *
     * @param uuid The UUID to encode.
     * @return A URL-safe Base64 encoded UUID string.
     */
    public static String uuidToBase64(UUID uuid) {
        ByteBuffer bb = ByteBuffer.wrap(new byte[16]);
        bb.putLong(uuid.getMostSignificantBits());
        bb.putLong(uuid.getLeastSignificantBits());
        return URL_SAFE_ENCODER.encodeToString(bb.array());
    }

    /**
     * Decode a URL-safe Base64 string back to a UUID.
     *
     * @param base64 A URL-safe Base64 encoded UUID string.
     * @return The decoded UUID.
     */
    public static UUID base64ToUuid(String base64) {
        byte[] bytes = URL_SAFE_DECODER.decode(base64);
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        long high = bb.getLong();
        long low = bb.getLong();
        return new UUID(high, low);
    }

}
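
A quick round-trip sketch using the class above, useful when the Base64 form appears in URLs and the UUID form in the database:

import com.networknt.utility.UuidUtil;
import java.util.UUID;

// Generate a UUIDv7, expose it as a 22-character URL-safe string, and decode it back.
UUID id = UuidUtil.getUUID();
String urlSafe = UuidUtil.uuidToBase64(id);     // e.g. used as a path parameter
UUID decoded = UuidUtil.base64ToUuid(urlSafe);  // equals(id) is true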

Composite key vs Surrogate UUID key

Composite key with 5 or more columns

Use the following three tables as examples. We have a composite key with 5 columns, some of them varchar types, in the product_version_property_t table. Is it a good idea to create UUID keys for config_property_t and product_version_t?

-- each config file will have a config_id reference and this table contains all the properties including default. 
CREATE TABLE config_property_t (
    config_id                 UUID NOT NULL,
    property_name             VARCHAR(64) NOT NULL,
    property_type             VARCHAR(32) DEFAULT 'Config' NOT NULL,
    light4j_version           VARCHAR(12), -- only newly introduced property has a version.
    display_order             INTEGER,
    required                  BOOLEAN DEFAULT false NOT NULL,
    property_desc             VARCHAR(4096),
    property_value            TEXT,
    value_type                VARCHAR(32),
    property_file             TEXT,
    resource_type             VARCHAR(30) DEFAULT 'none',
    update_user               VARCHAR(255) DEFAULT SESSION_USER NOT NULL,
    update_ts                 TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL
);

ALTER TABLE config_property_t
    ADD CHECK ( property_type IN ( 'Cert', 'Config', 'File') );


COMMENT ON COLUMN config_property_t.property_value IS
    'Property Default Value';

COMMENT ON COLUMN config_property_t.value_type IS
    'One of string, boolean, integer, float, map, list';

COMMENT ON COLUMN config_property_t.resource_type IS
  'One of none, api, app, app_api, api|app_api, app|app_api, all';

ALTER TABLE config_property_t ADD CONSTRAINT config_property_pk PRIMARY KEY ( config_id, property_name );



CREATE TABLE product_version_t (
    host_id                     UUID NOT NULL,
    product_id                  VARCHAR(8) NOT NULL,
    product_version             VARCHAR(12) NOT NULL, -- internal product version 
    light4j_version             VARCHAR(12) NOT NULL, -- open source release version
    break_code                  BOOLEAN DEFAULT false, -- breaking code change to upgrade to this version.
    break_config                BOOLEAN DEFAULT false, -- config server need this to decide if clone is allowed for this version. 
    release_note                TEXT,
    version_desc                VARCHAR(1024),
    release_type                VARCHAR(24) NOT NULL, -- Alpha Version, Beta Version, Release Candidate, General Availability, Production Release
    current                     BOOLEAN DEFAULT false,
    version_status              VARCHAR(16) NOT NULL, 
    update_user                 VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
    update_ts                   TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    PRIMARY KEY(host_id, product_id, product_version)
);


-- customized property for product version within the host.
CREATE TABLE product_version_property_t (
    host_id          UUID NOT NULL,
    product_id       VARCHAR(8) NOT NULL,
    product_version  VARCHAR(12) NOT NULL,
    config_id        UUID NOT NULL,
    property_name    VARCHAR(64) NOT NULL,
    property_value   TEXT,
    property_file    TEXT,
    update_user      VARCHAR (126) DEFAULT SESSION_USER NOT NULL,
    update_ts        TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL
);

ALTER TABLE product_version_property_t
    ADD CONSTRAINT product_version_property_pk PRIMARY KEY ( host_id, 
                                                             product_id,
                                                             product_version,
                                                             config_id,
                                                             property_name);

Replacing the wide, multi-column composite primary key in product_version_property_t with one based on surrogate UUID keys from the parent tables is generally a very good idea for several reasons, primarily related to performance and maintainability.

Let’s break down the pros and cons:

Current Situation (Composite Keys):

  • config_property_t PK: (config_id UUID, property_name VARCHAR(64)) - Size: 16 + (up to 64 + overhead) bytes. Reasonably sized.
  • product_version_t PK: (host_id UUID, product_id VARCHAR(8), product_version VARCHAR(12)) - Size: 16 + (8 + overhead) + (12 + overhead) bytes. Getting wider.
  • product_version_property_t PK: (host_id UUID, product_id VARCHAR(8), product_version VARCHAR(12), config_id UUID, property_name VARCHAR(64)) - Size: 16 + (8+oh) + (12+oh) + 16 + (64+oh) bytes. This is quite wide.

Problems with the Wide Composite Key in product_version_property_t:

  1. Large Primary Key Index: The B-tree index for this 5-column PK will be large. Larger indexes consume more disk space, more memory (reducing cache efficiency), and take longer to scan during queries, updates, and deletes.
  2. Large Foreign Keys (Implicit): If any other table were to reference product_version_property_t (unlikely given its nature, but possible), the foreign key would also need all 5 columns.
  3. Larger Secondary Indexes: Any other indexes you create on product_version_property_t will typically include the primary key columns implicitly (in PostgreSQL leaf nodes). A wide PK bloats all other indexes on the table, magnifying the performance impact.
  4. Update Overhead: Updating any of the PK columns (though generally discouraged) is very expensive. More importantly, index maintenance during INSERTs and DELETEs is heavier with wider keys.
  5. Join Complexity: Joining on 5 columns is syntactically more verbose.

Proposed Solution (Surrogate UUID Keys):

  1. Add Surrogate Key to config_property_t:

    • Add a new column: cp_id UUID PRIMARY KEY DEFAULT gen_random_uuid() (or use UUIDv7 generator).
    • Add a UNIQUE constraint on the original natural key: ALTER TABLE config_property_t ADD CONSTRAINT config_property_nat_key UNIQUE (config_id, property_name); (This is CRITICAL to maintain data integrity).
    • Size of new PK: 16 bytes.
  2. Add Surrogate Key to product_version_t:

    • Add a new column: pv_id UUID PRIMARY KEY DEFAULT gen_random_uuid() (or UUIDv7).
    • Add a UNIQUE constraint on the original natural key: ALTER TABLE product_version_t ADD CONSTRAINT product_version_nat_key UNIQUE (host_id, product_id, product_version); (CRITICAL).
    • Size of new PK: 16 bytes.
  3. Modify product_version_property_t:

    • Remove the existing 5-column PK.
    • Replace the 5 individual columns used for the PK/FK relationship with two new columns: pv_id UUID and cp_id UUID.
    • Create the new Primary Key: ALTER TABLE product_version_property_t ADD CONSTRAINT product_version_property_pk PRIMARY KEY (pv_id, cp_id);
    • Add Foreign Key constraints:
      ALTER TABLE product_version_property_t
          ADD CONSTRAINT product_version_property_fk_pv
          FOREIGN KEY (pv_id) REFERENCES product_version_t(pv_id) ON DELETE CASCADE; -- Or appropriate action
      
      ALTER TABLE product_version_property_t
          ADD CONSTRAINT product_version_property_fk_cp
          FOREIGN KEY (cp_id) REFERENCES config_property_t(cp_id) ON DELETE CASCADE; -- Or appropriate action
      
    • Size of new PK: 16 + 16 = 32 bytes.

Advantages of the Proposed Solution:

  1. Significantly Smaller PK Index: The PK index on product_version_property_t shrinks from potentially >100 bytes wide to just 32 bytes. This is the biggest win.
  2. Smaller Secondary Indexes: All other indexes on product_version_property_t will also be smaller.
  3. Improved Query Performance: Smaller indexes mean faster scans, better cache utilization, and quicker lookups, updates, and deletes involving the PK or indexes on product_version_property_t.
  4. Simpler Joins: Joining product_version_property_t to its parent tables now only requires joining on the single pv_id or cp_id columns.
  5. Simpler Foreign Keys: The FK definitions are simpler.

Disadvantages/Considerations:

  1. Need for UNIQUE Constraints: You absolutely must add the UNIQUE constraints on the original natural keys in the parent tables (config_property_t, product_version_t) to prevent duplicate entries based on business logic. This adds two new indexes.
  2. Storage Overhead (Parent Tables): You add 16 bytes per row to config_property_t and product_version_t for the new UUID PKs.
  3. Migration Effort: You need to:
    • Add the new UUID columns to the parent tables.
    • Populate them with unique UUIDs for all existing rows.
    • Add the new FK columns (pv_id, cp_id) to product_version_property_t.
    • Update product_version_property_t to populate these new columns by joining back to the parent tables based on the original composite keys.
    • Drop the old PK and FK constraints (or the old columns if they are no longer needed besides the value overrides).
    • Add the new PK and FK constraints.
    • Add the UNIQUE constraints on the parent tables.
  4. Lookups by Natural Key: Queries that previously used the composite PK directly on the parent tables (e.g., SELECT ... FROM product_version_t WHERE host_id = ? AND product_id = ? AND product_version = ?) will now use the new UNIQUE index instead of the PK index. Performance should still be very good, but it’s technically hitting a different index.

Conclusion:

Yes, it is a good idea. The performance benefits gained from dramatically reducing the size of the primary (and consequently, secondary) indexes on the frequently accessed product_version_property_t table likely outweigh the disadvantages (extra storage on parent tables, migration effort, extra unique indexes). The simplification of joins and foreign keys is also a significant maintainability improvement.

Recommendation:

Proceed with the plan to introduce surrogate UUID keys (preferably UUIDv7 for generation if using a library) on config_property_t and product_version_t, and change product_version_property_t to use these two UUIDs as its composite primary key. Do not forget to add the UNIQUE constraints on the original natural keys in the parent tables.
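
When the surrogate key is generated in the application rather than by gen_random_uuid(), the insert into product_version_t might look like the following sketch. The pv_id column is the proposed surrogate key from the plan above, and the method shape is illustrative:

import com.github.f4b6a3.uuid.UuidCreator;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

// Sketch: application-generated UUIDv7 surrogate key for product_version_t.
public UUID insertProductVersion(Connection conn, UUID hostId, String productId,
                                 String productVersion, String light4jVersion,
                                 String releaseType, String versionStatus) throws SQLException {
    UUID pvId = UuidCreator.getTimeOrderedEpoch();   // UUIDv7: roughly sequential, index-friendly
    String sql = "INSERT INTO product_version_t "
               + "(pv_id, host_id, product_id, product_version, light4j_version, release_type, version_status) "
               + "VALUES (?, ?, ?, ?, ?, ?, ?)";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setObject(1, pvId);
        ps.setObject(2, hostId);
        ps.setString(3, productId);
        ps.setString(4, productVersion);
        ps.setString(5, light4jVersion);
        ps.setString(6, releaseType);
        ps.setString(7, versionStatus);
        ps.executeUpdate();
    }
    return pvId;   // child rows in product_version_property_t reference this single UUID
}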

Composite key with 3 or 4 columns

For a composite key with 3 or 4 columns, the answer is less definitive than in the 5-column case. It moves into “it depends” territory, requiring you to weigh the pros and cons more carefully based on the specifics.

Here’s a breakdown of factors to consider for 3 or 4 column composite primary keys:

Arguments for Sticking with the Composite Natural Key (CNK):

  1. Simplicity (Potentially): No need for an extra surrogate key column and an extra UNIQUE index on the natural key columns. The schema might feel slightly less cluttered if the natural key is intuitive and stable.
  2. Reduced Storage (Parent Table): Avoids adding 16 bytes per row for the UUID PK in the table itself.
  3. Meaningful Key: The PK components have inherent business meaning, which can sometimes be useful for direct queries or understanding relationships without extra joins (though the UNIQUE index on the SUK approach provides this lookup too).
  4. Migration Cost: Avoids the effort of adding columns, backfilling data, and changing referencing tables.

Arguments for Refactoring to a Surrogate UUID Key (SUK):

  1. Index Size (Still Relevant): This is the biggest factor.
    • Calculate the Width: Add up the maximum potential size of the 3 or 4 columns in the CNK.
      • UUID: 16 bytes
      • INT: 4 bytes
      • BIGINT: 8 bytes
      • VARCHAR(N): N bytes + 1 or 4 bytes overhead (depending on length)
      • TIMESTAMP: 8 bytes
      • BOOLEAN: 1 byte
    • Compare: Compare the calculated width to the typical width of a surrogate key reference (16 bytes for one UUID, or 32 bytes if the child table needs two UUIDs like in your product_version_property_t example).
    • Threshold: If the CNK width starts exceeding ~32-40 bytes, the performance benefits of a narrower SUK (especially for secondary indexes and joins) become increasingly attractive. Even a 3-column key like (UUID, VARCHAR(8), VARCHAR(12)) is already 16 + (8+1) + (12+1) = 38 bytes. A 4-column key is almost certainly wider.
  2. Secondary Index Bloat: Remember, all other indexes on the table implicitly include the PK columns. A wide CNK makes every index larger, impacting cache efficiency and scan speed across the board. This effect is magnified if you have many secondary indexes.
  3. Join Performance/Simplicity: Joining on a single UUID column is generally faster and syntactically simpler than joining on 3 or 4 columns, especially if some are strings.
  4. Foreign Key Simplicity: Tables referencing this table only need to store a single UUID column as the foreign key, rather than 3 or 4 columns. This significantly reduces storage and complexity in child tables.
  5. Immutability/Stability: Surrogate keys are inherently stable. If there’s any chance the values in the natural key columns might need to change (which is generally bad practice for PKs but sometimes unavoidable), using a SUK provides crucial insulation.

Decision Rubric (Heuristics):

  • Scenario 1: Key is mostly small, fixed-width types (e.g., 3 INTs, 2 INTs + UUID). Total width < ~32 bytes.
    • Lean towards: Keeping the CNK might be acceptable, especially if the table isn’t heavily referenced or doesn’t have many secondary indexes. The benefits of SUK are less pronounced.
  • Scenario 2: Key includes even one moderately sized VARCHAR (e.g., VARCHAR(20+)) or multiple smaller VARCHARs. Total width > ~32-40 bytes.
    • Lean towards: Refactoring to SUK. The performance benefits from reduced index sizes (primary and secondary) and simpler joins/FKs likely start to outweigh the costs.
  • Scenario 3: Key includes wide VARCHARs (VARCHAR(50+)).
    • Lean towards: Strongly consider refactoring to SUK. Benefits are significant.
  • Scenario 4: The table is frequently referenced by other tables via FK.
    • Lean towards: Refactoring to SUK to simplify the child tables.
  • Scenario 5: The table has many secondary indexes.
    • Lean towards: Refactoring to SUK to keep those indexes smaller.
  • Scenario 6: Any component of the natural key is potentially volatile.
    • Lean towards: Refactoring to SUK for stability.

Conclusion:

For 3 or 4 columns, it’s less of a slam dunk than for 5 columns, but refactoring to a surrogate UUID key is still often beneficial, especially if VARCHARs are involved or the total key width exceeds roughly 32-40 bytes. The key drivers are the reduction in primary and secondary index size and the simplification of foreign keys and joins.

Carefully calculate the width of your specific 3 or 4 column composite keys and consider the usage patterns of the table before deciding.

Atomic Events vs Composite Events

Event Design

This is a fundamental design decision in any Event Sourcing system, and the choice has significant long-term consequences for your system’s flexibility, maintainability, and clarity.

The overwhelming consensus and best practice is to design atomic events.

Let’s break down why, and then look at the trap of composite events.


1. Atomic Events (The Best Practice)

An atomic event represents a single, granular, immutable fact that has occurred in your system. It is the smallest possible unit of change that has meaning to the business domain.

Example Scenario: A user updates their profile by changing their name and their shipping address on the same form.

With atomic events, this single user action (the Command) would result in two separate events being written to the stream:

  1. CustomerNameChanged { customerId: "123", newName: "Jane Doe" }
  2. CustomerAddressChanged { customerId: "123", newAddress: "..." }

Why Atomic Events are Superior:

a) Maximum Flexibility and Reusability:

  • Targeted Consumers: You can have different parts of your system (projections, process managers, other microservices) subscribe to only the events they care about. The shipping department only needs to know about CustomerAddressChanged, while the marketing department might only care about CustomerNameChanged. With a composite event, both would have to subscribe and parse the larger event to see if the part they care about was updated.
  • Future-Proofing: Six months from now, you might need to build a new feature that triggers a welcome kit to be sent when a customer provides an address for the first time. It’s trivial to add a new consumer for the CustomerAddressChanged event.

b) Clear and Unambiguous Intent:

  • Each event has a single, well-defined purpose. The name of the event, like ItemAddedToCart, tells you exactly what happened.
  • A composite event like ProfileUpdated is ambiguous. Did the name change? The address? The phone number? The password? You have to inspect the payload to understand the actual change, which makes the event stream harder to read and debug.

c) Simplified Event Handlers (Projections):

  • Your event handlers become small and simple. The logic for handling CustomerAddressChanged is focused purely on updating the address.
  • With a composite event, your handler becomes a complex series of if statements: if (event.newName != null) { updateName(); } if (event.newAddress != null) { updateAddress(); } .... This is brittle and hard to maintain.

d) Better for Versioning:

  • Events are immutable and stored forever. Your system will evolve. It’s much easier to version a small, focused event (e.g., add an optional field to CustomerAddressChanged_v2) than it is to version a large, complex composite event where any one of its dozen fields might change.

2. Composite Events (The Anti-Pattern)

A composite event attempts to bundle multiple state changes into a single, large event. It’s often a result of developers mapping the event directly to the UI form or the incoming command.

Using the same scenario (updating name and address):

A composite event might look like this:

  • CustomerProfileUpdated { customerId: "123", name: "Jane Doe", address: "...", phoneNumber: null }

The Problems with this Approach:

  • Loss of Granularity: As explained above, you lose the ability to react to specific changes.
  • Ambiguous Intent: The event name doesn’t tell the whole story.
  • Sparse Payloads: What if the user only changed their name? The address and phoneNumber fields in the event payload would be null or unchanged. Your consumers have to handle these sparse, optional fields, leading to more complex logic.
  • False Cohesion: It groups things together that only changed at the same time but are not necessarily part of the same business fact.

3. The Key Insight: The Role of the Aggregate

You might be thinking, “But the name and address change must be atomic! What if the system crashes after writing the first event but before the second?”

This is where the Aggregate from Domain-Driven Design (DDD) comes in. The Aggregate is the consistency boundary.

Here is the correct flow:

  1. Command: A single UpdateCustomerProfileCommand is sent to the system. It contains both the new name and the new address.
  2. Aggregate: The Customer Aggregate receives the command. It validates the business rules (e.g., the name isn’t empty, the address is valid).
  3. Event Generation: If the rules pass, the Aggregate’s method produces a list of atomic events: [CustomerNameChanged, CustomerAddressChanged].
  4. Atomic Persistence: The Event Store then takes this list of events and persists them to the event stream in a single, atomic transaction. It’s all or nothing. You will never have a CustomerNameChanged in the stream without the corresponding CustomerAddressChanged from the same command.

Key Takeaway: The Command can be composite (representing a single user gesture), but it should be resolved into a set of atomic Events. The atomicity of writing these events is guaranteed by your Aggregate and Event Store, not by bundling them into a single event type.
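
To make steps 2 and 3 concrete, here is a minimal sketch with hypothetical Customer aggregate, command, and event types; none of these names come from the light-portal codebase:

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical aggregate: one composite command in, a list of atomic events out.
public class Customer {
    private final UUID customerId;
    private String name;
    private String address;

    public Customer(UUID customerId, String name, String address) {
        this.customerId = customerId;
        this.name = name;
        this.address = address;
    }

    public List<Object> handle(UpdateCustomerProfileCommand cmd) {
        List<Object> events = new ArrayList<>();
        if (cmd.newName() != null && !cmd.newName().equals(name)) {
            events.add(new CustomerNameChanged(customerId, cmd.newName()));       // atomic fact #1
        }
        if (cmd.newAddress() != null && !cmd.newAddress().equals(address)) {
            events.add(new CustomerAddressChanged(customerId, cmd.newAddress())); // atomic fact #2
        }
        return events;   // the event store persists this list in ONE atomic append
    }
}

// Records standing in for the composite command and the two atomic events.
record UpdateCustomerProfileCommand(UUID customerId, String newName, String newAddress) {}
record CustomerNameChanged(UUID customerId, String newName) {}
record CustomerAddressChanged(UUID customerId, String newAddress) {}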


How to Design Good Atomic Events (A Checklist)

  • Name it in the Past Tense: UserRegistered, OrderShipped, PasswordReset. It’s a fact that has already happened.
  • Capture Business Intent: Don’t just record a CRUD-like change. ProductPriceUpdated is okay, but PriceAdjustedForSale is better because it captures the why.
  • Ensure it’s a Complete Fact: Include all necessary data for a consumer to understand the event without having to look up previous state. For example, ItemAddedToCart should include productId, quantity, and priceAtTimeOfAdding, not just productId.
  • Include Causation and Correlation IDs: Add metadata to your events. Who triggered this change (userId)? What command caused it (causationId)? What overall business process is this part of (correlationId)?
  • Think “What happened?” not “What changed?”: An event is a story. OrderSubmitted is a great event. A composite event like OrderStateChanged { oldState: "Pending", newState: "Submitted" } is far less expressive.
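
To illustrate the checklist, a complete, past-tense atomic event with causation and correlation metadata might look like the following sketch; all names are illustrative:

import java.math.BigDecimal;
import java.time.OffsetDateTime;
import java.util.UUID;

// A complete fact: consumers need no lookup of prior state to act on it.
record ItemAddedToCart(
        UUID cartId,
        UUID productId,
        int quantity,
        BigDecimal priceAtTimeOfAdding,   // captured now, not re-read later
        OffsetDateTime occurredAt,        // when the fact happened
        UUID userId,                      // who triggered the change
        UUID causationId,                 // the command that caused it
        UUID correlationId) {}            // the overall business process it belongs to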

Final Verdict

Always design atomic events.

Use the Aggregate pattern to ensure that multiple atomic events resulting from a single command are persisted transactionally. This approach gives you a system that is far more flexible, maintainable, readable, and ready for future evolution.

Transaction

When using Kafka to handle the events, we need to manage transactions on both the producer side and the consumer side, along with the database transaction, to ensure multiple events can be handled together. The choice you make here directly impacts the atomicity, ordering, and resilience of your entire system.

Let’s break down the options and the correct way to handle transactions on both the producer and consumer side.

The Correct Approach: Multiple Events, One Transaction

The best practice, by far, is to push multiple, individual event messages per command, wrapped in a single Kafka producer transaction.

This approach maintains the benefits of atomic events (flexibility, clarity) while guaranteeing the all-or-nothing atomicity you need.

How to Implement This (Producer Side)

Your service that processes the command and talks to the Aggregate will do the following:

  1. Generate Events: The Customer Aggregate generates the list of atomic events: [CustomerNameChanged, CustomerAddressChanged].
  2. Use Kafka Producer Transactions: This is the key feature that makes it all work.

Here is a conceptual code flow (using Java-like syntax):

// IMPORTANT: Configure your producer for transactions and idempotence
// props.put("transactional.id", "my-unique-transactional-id");
// props.put("enable.idempotence", "true");
KafkaProducer<String, Event> producer = new KafkaProducer<>(props);

// The list of events from your Aggregate
List<Event> events = customerAggregate.handle(updateProfileCommand);

// 1. Initialize the transaction
producer.initTransactions();

try {
    // 2. Begin the transaction
    producer.beginTransaction();

    // The Aggregate ID (e.g., "customer-123") is the Kafka Key
    String aggregateId = customerAggregate.getId();

    for (Event event : events) {
        // 3. Send EACH event as a SEPARATE message.
        // CRUCIAL: All events for this transaction MUST have the same key.
        // This ensures they all go to the same partition and are consumed in order.
        producer.send(new ProducerRecord<>("customer-events-topic", aggregateId, event));
    }

    // 4. Commit the transaction.
    // This makes all messages in the transaction visible to consumers atomically.
    producer.commitTransaction();

} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    // These are fatal errors, we should close the producer
    producer.close();
} catch (KafkaException e) {
    // 5. If anything goes wrong, abort. None of the messages will be visible.
    producer.abortTransaction();
}

producer.close();

Why this is the best way:

  • Atomicity Guaranteed: Kafka guarantees that consumers configured with isolation.level=read_committed will either see ALL the messages from commitTransaction or NONE of them (if you abortTransaction).
  • Ordering Guaranteed: By using the same key (aggregateId) for all events in the transaction, you ensure they are written to the same partition in the exact order you sent them. Your consumer will read them in that same order.
  • Consumer Flexibility: Your stream processors can now consume individual, meaningful events. A shipping-related processor can filter for and process only CustomerAddressChanged events, completely ignoring CustomerNameChanged.

How to Process Events Transactionally (Consumer Side)

Now, how does your streams processor populate the database tables while maintaining consistency? This is often called the “Transactional Outbox” pattern, but in reverse—a “Transactional Inbox”.

The goal is to atomically update the database AND commit the Kafka offset. You never want to commit an offset for a message whose database update failed.

Here is the standard, robust pattern for a custom consumer/streams processor:

  1. Disable Kafka Auto-Commit: This is the most important step. Your application must take manual control of committing offsets. In your consumer configuration, set enable.auto.commit=false.

  2. Consume and Process in Batches:

// This is a conceptual loop for your consumer
while (true) {
    // 1. Poll for a batch of records. Kafka gives you a batch.
    ConsumerRecords<String, Event> records = consumer.poll(Duration.ofMillis(1000));

    if (records.isEmpty()) {
        continue;
    }

    // Get your database connection
    Connection dbConnection = database.getConnection();
    dbConnection.setAutoCommit(false); // Start manual DB transaction management

    try {
        // 2. Process each record in the polled batch
        for (ConsumerRecord<String, Event> record : records) {
            Event event = record.value();
            // Apply the change to the database based on the event type
            processEvent(event, dbConnection);
        }

        // 3. If all events in the batch were processed successfully, commit the database transaction
        dbConnection.commit();

        // 4. IMPORTANT: Only after the DB commit succeeds, commit the Kafka offset.
        // This tells Kafka "I have successfully and durably processed all messages up to this point."
        consumer.commitSync();

    } catch (SQLException e) {
        // 5. If the DB update fails, rollback the DB transaction...
        dbConnection.rollback();
        // ...and DO NOT commit the Kafka offset.
        // The consumer will re-poll and re-process this same batch of messages later.
        // This is why your processing logic MUST be idempotent.
        System.err.println("Database update failed. Rolling back. Will retry batch.");
        // You might want to seek to the beginning of the failed batch to be explicit
        // e.g., consumer.seek(new TopicPartition(topic, partition), firstOffsetOfFailedBatch);
    } finally {
        dbConnection.close();
    }
}

It is possible to handle transactions in a Kafka Streams processor, but it requires using the low-level Processor API and is significantly more complex than the standard consumer approach. You cannot achieve this with the high-level DSL (.map(), .filter(), etc.) alone.

If your processor’s only job is to read from Kafka and write to a database: Use the Plain Kafka Consumer. It is simpler, more direct, less error-prone, and purpose-built for this task. You are essentially building a custom, lightweight Kafka Connect sink.
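
As a rough sketch, the plain consumer for this pattern would be configured along these lines (the topic name, group id, and value deserializer class are assumptions for illustration). Note isolation.level=read_committed, which is required so the consumer only sees events from committed producer transactions:

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.util.List;
import java.util.Properties;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");      // assumed broker address
props.put(ConsumerConfig.GROUP_ID_CONFIG, "customer-projection-builder");  // assumed group id
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");              // manual offset control
props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");        // only see committed transactions
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringDeserializer");
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
        "com.example.EventJsonDeserializer");                              // hypothetical event deserializer

KafkaConsumer<String, Event> consumer = new KafkaConsumer<>(props);
consumer.subscribe(List.of("customer-events-topic"));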

The Critical Need for Idempotency

Because a failure can occur after the DB commit but before the Kafka offset commit, your application might restart and re-process the same batch of events.

Your database update logic must be idempotent. This means running the same update multiple times produces the same result as running it once.

Examples of Idempotent Operations:

  • INSERT with a primary key: INSERT INTO customers (...) VALUES (...) ON DUPLICATE KEY UPDATE ... (MySQL) or INSERT ... ON CONFLICT ... DO UPDATE ... (PostgreSQL).
  • UPDATE statements: UPDATE customers SET name = 'Jane Doe' WHERE customer_id = '123'. Running this 5 times is the same as running it once.
  • Using Versioning: Store a version or last_processed_event_id in your database table.
    UPDATE customers
    SET name = 'Jane Doe', version = 2
    WHERE customer_id = '123' AND version = 1;
    
    If the update tries to run again, the WHERE clause will not match, and no rows will be affected.
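
As a sketch of the versioning approach in plain JDBC (the table and column names follow the example above, and dbConnection is assumed to be the connection from the consumer's transaction):

// Optimistic, idempotent update: only applies if the expected version is still current.
String sql = "UPDATE customers SET name = ?, version = ? WHERE customer_id = ? AND version = ?";
try (java.sql.PreparedStatement stmt = dbConnection.prepareStatement(sql)) {
    stmt.setString(1, "Jane Doe");
    stmt.setLong(2, 2L);          // new version
    stmt.setString(3, "123");
    stmt.setLong(4, 1L);          // expected current version
    int rowsAffected = stmt.executeUpdate();
    if (rowsAffected == 0) {
        // Either the event was already applied (safe to ignore) or the row is at an
        // unexpected version (worth logging). Either way, no duplicate update happens.
    }
}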

Why Not Put a List of Events in One Message?

This is an anti-pattern that solves one problem (producer atomicity) by creating many more downstream.

  • Loss of Meaning: The fundamental unit is the event, not a list of events. A Kafka message should represent one fact.
  • Consumer Complexity: Every single consumer now has to be written to expect a list. It has to deserialize the list and loop through it.
  • No Filtering: A consumer who only cares about CustomerAddressChanged still has to receive and parse the entire message containing the CustomerNameChanged event, only to discard it. This is inefficient and tightly couples your consumers to the producer’s batching behavior.
  • Versioning Hell: Versioning a list of events is much harder than versioning a single event.

Summary

  • Event Design: Atomic events such as CustomerNameChanged and CustomerAddressChanged.
  • Producing to Kafka: Multiple messages, one Kafka transaction; use producer.beginTransaction() and producer.commitTransaction().
  • Kafka Message Key: The aggregate ID; use the same key (e.g., customer-123) for all events from the same command to ensure ordering.
  • Consuming from Kafka: Manual offset commits; disable auto-commit.
  • Database Updates: Transactional batch processing: [Start DB Tx] -> [Process Batch] -> [Commit DB Tx] -> [Commit Kafka Offset].
  • Database Logic: Idempotent; your UPDATE/INSERT logic must handle being re-run on the same event without causing errors or incorrect data.

Mixed Aggregates vs Single Aggregate

In the simple batch-processing consumer example I provided, the Kafka message key is not being used to segregate processing. The example processes a batch of records polled from Kafka, and that batch can indeed contain events for many different user_ids or host_ids, all mixed together in a single database transaction.

Let’s break down why this happens, the implications, and how to design a consumer that does respect aggregate boundaries for processing.


Why the Simple Batch Consumer Mixes Aggregates

  1. Kafka’s Partitioning: You use the user_id/host_id as the key. Kafka’s producer hashes this key to determine which partition the message goes to. This is excellent because it guarantees that all events for a single user (a single aggregate) will always go to the same partition and will be consumed in the order they were produced.

  2. The Consumer’s Polling: A Kafka consumer is assigned one or more partitions to read from. When it calls consumer.poll(), it fetches a batch of records that have arrived on all of its assigned partitions since the last poll.

    • If your consumer is assigned Partition 0, and events for User A, User B, and User C have all landed on Partition 0, your polled batch will contain [EventA1, EventB1, EventC1, EventA2, ...].
    • They are mixed together, but the ordering per key is preserved (Event A1 will always come before Event A2).
  3. The Simple Transaction Loop: The example loop I showed takes this entire mixed batch (records) and processes it within one DB transaction.

    // This loop combines multiple aggregates into one DB transaction
    dbConnection.setAutoCommit(false); // start one DB transaction for the whole mixed batch
    for (ConsumerRecord record : records) { // 'records' contains events for User A, B, C...
        updateDatabase(record.value());
    }
    dbConnection.commit();
    

Is This a Problem? (The Trade-offs)

For many use cases, processing mixed aggregates in a single batch is perfectly fine and often more performant.

  • Pro: High Throughput. Batching database commits is much more efficient than committing after every single event. Committing a transaction that updates 100 rows for 50 different users is faster than running 100 separate transactions.
  • Con: “Noisy Neighbor” Problem. If processing an event for User C throws an unrecoverable SQLException, the entire batch transaction will be rolled back. This means the valid updates for User A and User B will also be rolled back and retried. The failure of one aggregate’s event processing blocks the progress of others in the same batch.
  • Con: Loss of Concurrency. You are processing everything serially within a single consumer thread. You aren’t taking advantage of the fact that User A’s events are independent of User B’s events.

The Better Approach: Processing per Aggregate

If you want to isolate failures and potentially parallelize work, you need to change your consumer logic to process events grouped by their key (user_id/host_id).

This pattern is more complex but far more robust for multi-tenant systems.

Conceptual Code for Aggregate-based Processing

This approach reorganizes the polled batch by key before processing.

// Still disable auto-commit: enable.auto.commit=false
while (true) {
    ConsumerRecords<String, Event> records = consumer.poll(Duration.ofMillis(1000));
    if (records.isEmpty()) continue;

    // 1. Group the polled records by their key (the aggregate ID)
    Map<String, List<ConsumerRecord<String, Event>>> recordsByAggregate = new HashMap<>();
    for (ConsumerRecord<String, Event> record : records) {
        recordsByAggregate
            .computeIfAbsent(record.key(), k -> new ArrayList<>())
            .add(record);
    }

    // This map will collect the highest successfully processed offset for each partition from this poll
    Map<TopicPartition, OffsetAndMetadata> offsetsToCommit = new HashMap<>();

    // 2. Process the events for EACH aggregate in its OWN transaction
    for (Map.Entry<String, List<ConsumerRecord<String, Event>>> entry : recordsByAggregate.entrySet()) {
        String aggregateId = entry.getKey();
        List<ConsumerRecord<String, Event>> aggregateEvents = entry.getValue();

        // Start a DB transaction FOR THIS AGGREGATE ONLY
        Connection dbConnection = database.getConnection();
        dbConnection.setAutoCommit(false);

        try {
            for (ConsumerRecord<String, Event> record : aggregateEvents) {
                // Your idempotent database logic
                updateDatabaseForAggregate(record.value(), dbConnection);

                // Keep track of the highest offset we've successfully processed
                TopicPartition partition = new TopicPartition(record.topic(), record.partition());
                OffsetAndMetadata offset = new OffsetAndMetadata(record.offset() + 1);
                offsetsToCommit.merge(partition, offset, (oldVal, newVal) -> newVal.offset() > oldVal.offset() ? newVal : oldVal);
            }
            // Commit the DB transaction for this one aggregate
            dbConnection.commit();

        } catch (Exception e) {
            // FAILURE for a single aggregate!
            System.err.println("Failed to process batch for aggregate: " + aggregateId + ". Rolling back.");
            dbConnection.rollback();
            // What to do now?
            // Option A: Skip this aggregate and continue with others (might break ordering).
            // Option B (Better): Stop processing the entire poll, log the poison pill, and DO NOT commit any offsets.
            // Let's assume Option B. We would break out of this loop.
            // For simplicity, we'll just log and continue, but in reality, you need a robust dead-letter queue strategy here.
        } finally {
            dbConnection.close();
        }
    }

    // 3. After attempting to process all aggregates in the batch, commit the offsets
    // for all the partitions where we made progress.
    if (!offsetsToCommit.isEmpty()) {
        consumer.commitSync(offsetsToCommit);
    }
}

Key Differences and Improvements in this Pattern:

  1. Isolation: A failure in processing for user-123 no longer affects user-456. The transaction for user-456 can still be committed.
  2. Correctness: The database state for each aggregate is updated transactionally, respecting its own event sequence.
  3. Concurrency (Potential): This design is a stepping stone to true parallel processing. You could submit each List<ConsumerRecord> (the events for one aggregate) to a thread pool for processing, allowing you to update multiple users’ data concurrently. This requires careful management of the offsets to commit.
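
To illustrate point 3, here is a rough sketch of parallelizing the per-aggregate groups with a thread pool. It assumes the same recordsByAggregate map as above and a hypothetical processAggregateBatch helper that runs one DB transaction per aggregate; DLQ handling and pool lifecycle management are omitted:

import java.util.List;
import java.util.Map;
import java.util.concurrent.*;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Normally created once, not per poll
ExecutorService pool = Executors.newFixedThreadPool(8);
Map<TopicPartition, OffsetAndMetadata> offsetsToCommit = new ConcurrentHashMap<>();

// Submit one task per aggregate group from the current poll
List<Future<?>> futures = recordsByAggregate.values().stream()
    .map(aggregateEvents -> pool.submit(() -> {
        // Hypothetical helper: runs one DB transaction for this aggregate's events
        // and returns normally only if the commit succeeded.
        processAggregateBatch(aggregateEvents);
        for (ConsumerRecord<String, Event> rec : aggregateEvents) {
            TopicPartition tp = new TopicPartition(rec.topic(), rec.partition());
            OffsetAndMetadata om = new OffsetAndMetadata(rec.offset() + 1);
            offsetsToCommit.merge(tp, om,
                (oldVal, newVal) -> newVal.offset() > oldVal.offset() ? newVal : oldVal);
        }
    }))
    .toList();

try {
    for (Future<?> f : futures) {
        f.get(); // surfaces any aggregate failure
    }
    if (!offsetsToCommit.isEmpty()) {
        consumer.commitSync(offsetsToCommit); // only commit when every aggregate in the poll succeeded
    }
} catch (InterruptedException | ExecutionException e) {
    // At least one aggregate failed: do not commit; the whole poll will be retried.
    // Idempotent updates make the retry safe; a DLQ strategy belongs here in practice.
}

Ordering per aggregate is preserved because each aggregate's events stay within a single task; only different aggregates run concurrently.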

Summary and Recommendation

Your intuition was exactly right. The simple batch consumer ignores the key for processing logic, which can be a problem.

  • Simple Batch Processing
    • When to use: High-throughput, simple ETL jobs; when a single failure blocking a whole batch is acceptable; low-contention systems.
    • Key benefit: Simplicity and performance. Easiest to write and often the fastest for bulk loading.
    • Key drawback: Noisy neighbor problem. One bad event can halt processing for all others in the batch.
  • Per-Aggregate Processing
    • When to use: Multi-tenant systems (your case); systems requiring high reliability and fault isolation; when you need to guarantee progress for healthy aggregates even if others are failing.
    • Key benefit: Robustness and isolation. Failures are contained to a single aggregate.
    • Key drawback: Increased complexity. Requires more code to group records and manage transactions and offsets carefully.

For your system, especially a multi-tenant one, the “Per-Aggregate Processing” model is strongly recommended. The added complexity is a worthwhile investment for the significant gains in robustness and fault isolation. It correctly uses the Kafka key not just for partitioning, but also as the logical boundary for your downstream processing and transactions.

DLQ and Process Flow

This is the final, crucial piece of the puzzle for building a truly resilient system. You’ve identified the exact scenario that separates a robust consumer from a brittle one: handling a “poison pill” message.

A poison pill is a message that your consumer cannot process successfully, no matter how many times it retries. A database constraint violation (UNIQUE constraint failed, FOREIGN KEY constraint failed) is a perfect example. If you simply roll back and retry, you will be stuck in an infinite loop, processing the same failing message forever and making no progress.

Yes, you absolutely must handle this by routing the problematic events to a Dead Letter Queue (DLQ), which is typically another Kafka topic.

Here is the complete, robust workflow for handling these failures.


The Dead Letter Queue (DLQ) Pattern

The goal is to isolate the “bad” event(s) so that the consumer can move on and continue processing subsequent “good” events for that aggregate and others.

1. The DLQ Topic

First, you create another Kafka topic specifically for these failed events. For example, if your main topic is customer-events, your DLQ topic might be customer-events-dlq.

The messages in the DLQ topic should be enriched with metadata about the failure:

  • The original event payload.
  • The original topic, partition, and offset.
  • The consumer-group that failed to process it.
  • A timestamp of the failure.
  • The error message or stack trace (e.g., “UNIQUE constraint failed on customers.email”).
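
A minimal sketch of such an enriched DLQ payload, assuming a simple Java record; the name DeadLetterEvent and its fields are illustrative, chosen to match the helper used in the code below:

import java.time.Instant;

// Hypothetical shape of a DLQ message: the original event plus failure context.
public record DeadLetterEvent(
        String originalTopic,
        int originalPartition,
        long originalOffset,
        String consumerGroup,
        Instant failedAt,
        String errorMessage,       // e.g., "UNIQUE constraint failed on customers.email"
        String originalPayload) {  // the original event, serialized as JSON
}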

2. Modified Consumer Logic with DLQ

Let’s refine the “Per-Aggregate Processing” logic to include the DLQ step.

// Assumes you have a separate KafkaProducer instance for the DLQ
KafkaProducer<String, DeadLetterEvent> dlqProducer = ...;

while (true) {
    ConsumerRecords<String, Event> records = consumer.poll(...);
    if (records.isEmpty()) continue;

    // Group records by aggregate key
    Map<String, List<ConsumerRecord<String, Event>>> recordsByAggregate = groupRecordsByKey(records);

    Map<TopicPartition, OffsetAndMetadata> offsetsToCommit = new HashMap<>();

    for (Map.Entry<String, List<ConsumerRecord<String, Event>>> entry : recordsByAggregate.entrySet()) {
        String aggregateId = entry.getKey();
        List<ConsumerRecord<String, Event>> aggregateEvents = entry.getValue();

        Connection dbConnection = database.getConnection();
        dbConnection.setAutoCommit(false);

        try {
            for (ConsumerRecord<String, Event> record : aggregateEvents) {
                // Your idempotent database update logic
                updateDatabaseForAggregate(record.value(), dbConnection);
            }
            // If all events for this aggregate succeed, commit the DB transaction
            dbConnection.commit();

            // And mark the final offset for this aggregate as ready to commit
            markOffsetsAsProcessed(aggregateEvents, offsetsToCommit);

        } catch (SQLException e) {
            // A "poison pill" or unrecoverable error was detected!
            dbConnection.rollback(); // Roll back any partial DB changes for this aggregate

            System.err.println("Unrecoverable error processing aggregate " + aggregateId + ". Sending to DLQ. Error: " + e.getMessage());

            // **THE DLQ LOGIC**
            // Send the entire batch of events for this failing aggregate to the DLQ.
            // It's crucial to send the whole batch to preserve their relative order.
            for (ConsumerRecord<String, Event> failedRecord : aggregateEvents) {
                DeadLetterEvent dlqEvent = createDlqEvent(failedRecord, e);
                // Use the same key to keep ordering in the DLQ
                dlqProducer.send(new ProducerRecord<>("customer-events-dlq", aggregateId, dlqEvent));
            }

            // **CRITICAL STEP:** We have now "handled" these poison pill events by sending them to the DLQ.
            // We must treat them as successfully processed from the perspective of the main topic
            // so we can commit their offsets and move on.
            markOffsetsAsProcessed(aggregateEvents, offsetsToCommit);

        } catch (Exception otherException) {
            // Handle transient errors (e.g., network timeout) differently.
            // You might want to retry these without going to the DLQ immediately.
            dbConnection.rollback();
            // Don't commit offsets, let the consumer re-poll and retry.
            System.err.println("Transient error occurred. Will retry batch.");
            // Break the loop and wait for the next poll
            break;

        } finally {
            dbConnection.close();
        }
    }

    // After processing all aggregates in the poll...
    if (!offsetsToCommit.isEmpty()) {
        // Commit the offsets for both successfully processed and DLQ'd aggregates.
        // This advances the consumer past the poison pill.
        consumer.commitSync(offsetsToCommit);
    }
}

// Helper method to create a DLQ event
private DeadLetterEvent createDlqEvent(ConsumerRecord record, Exception e) {
    // Populate with original event, error message, topic, offset, etc.
    return new DeadLetterEvent(...);
}

// Helper method to update the offsets map
private void markOffsetsAsProcessed(List<ConsumerRecord<String, Event>> records, Map<TopicPartition, OffsetAndMetadata> offsets) {
    records.forEach(rec -> {
        TopicPartition partition = new TopicPartition(rec.topic(), rec.partition());
        OffsetAndMetadata offset = new OffsetAndMetadata(rec.offset() + 1);
        offsets.merge(partition, offset, (oldVal, newVal) -> newVal.offset() > oldVal.offset() ? newVal : oldVal);
    });
}

What to Do with the DLQ Topic?

The DLQ is not a garbage can. It’s a hospital for sick messages. You need a strategy for managing it.

  1. Monitoring and Alerting: Set up alerts on the DLQ topic. A message landing here is an exceptional event that indicates a bug, bad data, or a system inconsistency. A human needs to be notified.

  2. Manual Intervention: An operator or developer should inspect the DLQ message.

    • Is it a bug in the consumer? If so, deploy a fix to the consumer code.
    • Is it bad data from the producer? For example, a UserRegistered event was sent with an email that already exists. The upstream service needs to be fixed.
    • Is it a state inconsistency? Maybe an event arrived out of order due to a misconfiguration, and the state it expects in the database doesn’t exist yet.
  3. Reprocessing (The “Re-drive” Pattern): Once the underlying issue is fixed (e.g., the consumer bug is patched, or the inconsistent DB state is manually corrected), you need a way to re-introduce the events from the DLQ back into the main processing flow. This is typically done with a separate utility or “re-driver” application that reads from the DLQ and publishes the original event back to the original topic.
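
A bare-bones sketch of such a re-driver, assuming the DeadLetterEvent shape from the earlier sketch (carrying the original topic and serialized payload); filtering, pacing, and audit logging are omitted:

import java.time.Duration;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Assumed to exist: a consumer subscribed to the DLQ topic and a producer for the main topic.
KafkaConsumer<String, DeadLetterEvent> dlqConsumer = ...;  // subscribed to "customer-events-dlq"
KafkaProducer<String, String> mainProducer = ...;

while (true) {
    ConsumerRecords<String, DeadLetterEvent> dlqRecords = dlqConsumer.poll(Duration.ofSeconds(1));
    for (ConsumerRecord<String, DeadLetterEvent> record : dlqRecords) {
        DeadLetterEvent dead = record.value();
        // Keep the aggregate ID (the record key) so re-driven events preserve per-aggregate ordering.
        mainProducer.send(new ProducerRecord<>(dead.originalTopic(), record.key(), dead.originalPayload()));
    }
    mainProducer.flush();
    dlqConsumer.commitSync(); // advance the DLQ offsets only after the events have been republished
}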

Differentiating Error Types

It’s crucial to distinguish between:

  • Transient Errors: Network issues, temporary database unavailability, lock timeouts. These are retryable. The correct response is to roll back and not commit the offset, forcing a retry on the next poll.
  • Permanent Errors: DB constraint violations, deserialization errors, unrecoverable business logic failures (NullPointerException). These are not retryable. The correct response is to route to the DLQ and commit the offset to move on.

Your catch blocks should be structured to differentiate these.

try {
    // ... processing logic
} catch (SQLIntegrityConstraintViolationException | DeserializationException e) {
    // PERMANENT: Rollback, send to DLQ, commit offset
} catch (SQLTransientConnectionException | LockTimeoutException e) {
    // TRANSIENT: Rollback, DO NOT commit offset, let it retry
} catch (Exception e) {
    // Generic catch-all, probably treat as permanent to be safe
    // Rollback, send to DLQ, commit offset
}

By implementing this complete pattern, you create a system that is not only transactional and correct but also self-healing. It can automatically isolate failures, alert you to the problem, and continue operating for all healthy aggregates, preventing a single bad event from bringing your entire system to a halt.

Notification and Event Store


1. Is notification_t replacing the DLQ?

Short Answer: No, not effectively. They serve different primary purposes, though they can complement each other.

Let’s clarify the roles:

  • Dead Letter Queue (DLQ - Kafka Topic):

    • Primary Purpose: Operational recovery. It’s a queue of unprocessable messages that allows your consumer to move on and continue processing subsequent messages. It’s designed for reprocessing the original event once the underlying issue (code bug, bad data, external system outage) is resolved.
    • Nature: A temporary holding area for raw events that need to be re-driven into the main processing flow. It’s part of your automated error handling and retry mechanism.
    • Mechanism: It preserves the original message payload (and its context) in a format easily consumable by other Kafka applications (like a re-driver).
  • notification_t (Database Table):

    • Primary Purpose: Audit, visibility, and user-facing reporting. It’s a record of processing outcomes (success/failure) and associated metadata (error messages). It’s a read model or a projection for displaying status.
    • Nature: A durable log or materialized view of processing activity. It’s primarily for human intervention and analysis.
    • Mechanism: Stores a summary or specific details about what happened during processing, typically in a structured way that can be queried and displayed.

Why notification_t doesn’t replace a DLQ:

  1. Reprocessing:

    • If an event fails and you only log it to notification_t, your Kafka consumer is still stuck. If it commits the offset for that failed message anyway, the original message will eventually disappear from the Kafka topic once its retention period expires. You’d then have to reconstruct the original message from notification_t and manually re-publish it to Kafka, which is cumbersome.
    • A DLQ (Kafka topic) already holds the raw message and allows for a more automated re-driving process.
  2. Operational Flow:

    • A DLQ is part of an automated pipeline: consumer fails -> sends to DLQ -> consumer moves on. Alerts are triggered.
    • With just notification_t, you need an external mechanism (human reading the UI, another scheduled job) to query the table, identify failures, and trigger manual re-publishing. This is less reactive and scalable.
  3. Mixing Concerns:

    • Your notification_t table correctly stores processing results. This is a projection of the events.
    • The raw events themselves are what need to be re-driven.
    • A DLQ focuses solely on holding the raw, unprocessable events.

How they can complement each other:

  • When an event is sent to the DLQ, you also log an entry in notification_t indicating the failure, which event was sent to DLQ, and why. This provides the user-facing visibility you want while maintaining the operational robustness of the DLQ.
  • Your re-driver for the DLQ could also update the notification_t entry when an event is successfully re-processed.

Conclusion on DLQ vs. notification_t: Your notification_t is a valuable audit and reporting tool, but it should not be your sole mechanism for handling unprocessable Kafka messages. The DLQ pattern with a dedicated Kafka topic is the industry standard for robust, scalable error handling and reprocessing in a streaming architecture.


2. Using notification_t as the Event Store for replay?

Short Answer: This is generally a poor idea due to mixed concerns and potential data loss, unless your notification_t is specifically designed as a pure Event Store.

Let’s define “Event Store” in Event Sourcing:

  • The Event Store: This is the single, authoritative source of truth for your system’s state. It stores all historical domain events (atomic, immutable facts) in the exact order they occurred, for all time (or at least for a very long retention period). It’s used to:
    • Rebuild the current state of an aggregate.
    • Replay all events to build new read models (projections).
    • Perform historical analysis.

Evaluating notification_t as an Event Store:

  • “Save all the events”: This is the fundamental requirement. If it indeed stores the full, raw, original event payload for every event that enters your system, then this part is met.

  • “Success or failure of the processing with error message”: This is where it breaks the Event Store principle. An Event Store should only contain facts that happened. Whether an event was processed successfully or failed is a derived state (a projection or audit log entry), not the event itself.

    • Problem 1: Mixing Concerns: Mixing raw events with processing results violates the purity of an Event Store. It makes the Event Store harder to reason about and potentially less efficient for replay.
    • Problem 2: Data Integrity/Purity for Replay: If you replay events from this table, do you replay the “success/failure” status? No, you only care about the event itself. This metadata is irrelevant for rebuilding aggregate state or building new projections.
  • “Kafka topic might not contain all the events”: This is a critical point.

    • If your Kafka topics have short retention (e.g., 7 days), then yes, you absolutely need an external, durable Event Store that retains events indefinitely.
    • A relational database is a perfectly valid choice for an Event Store. Many Event Sourcing implementations use a relational DB table (events or event_stream) where each row is an event, uniquely identified, with the aggregate ID, sequence number, event type, and event payload.

Recommendation for your Event Store:

  1. Dedicate a separate table as your Event Store: If you want to use a database for event storage (which is fine!), create a table specifically for events or event_store.

    • This table should only store the raw, immutable event data (ID, aggregate ID, sequence number, timestamp, type, payload).
    • It should not include processing status or error messages. These are projection data.
  2. Maintain notification_t as a separate read model: Your notification_t table should remain exactly as it is: a valuable projection that tracks the processing success/failure for auditing and user visibility. It is derived from the events, not the source of events.

  3. Choose your Event Store carefully:

    • Kafka: If you configure Kafka topics for infinite retention (or very long, e.g., years), Kafka can serve as your primary Event Store. This is powerful for stream processing.
    • Relational Database: If Kafka retention is short, or if you prefer a traditional database for your Event Store, design a dedicated events table that conforms to the Event Store principles.
    • Specialized Event Stores: There are also purpose-built Event Stores (e.g., EventStoreDB) that excel at this task.

Combined Ideal Architecture:

  1. Producers send atomic events to Kafka topics (e.g., customer-events) using Kafka transactions for atomicity.
  2. Kafka topics are configured with long retention if they are your Event Store, or short retention if you’re using a separate database as Event Store.
  3. Consumers/Stream Processors read from Kafka.
    • They process events (e.g., update a materialized view, send to another service).
    • They also persist a copy of the event to your dedicated, pure Event Store table (if Kafka isn’t your long-term store). This must happen transactionally with their other updates or be a separate, guaranteed write.
    • They also update your notification_t table with the processing status (success/failure) for audit/UI.
    • If an unrecoverable error occurs, they send the original event to a DLQ topic (e.g., customer-events-dlq) and log the failure to notification_t.
    • They then commit their offsets on the main topic.
  4. DLQ management: Separate processes monitor the DLQ, allow manual inspection, and facilitate re-driving events back to the main topic after a fix.
  5. Replaying: When needed, new services or new read models can be built by replaying all events from your chosen Event Store (long-retention Kafka topics or your dedicated events database table).

This separation of concerns leads to a much more robust, understandable, and maintainable system.

Event Store

When to populate the event store is a crucial design decision for ensuring the integrity and completeness of your Event Store. There are two main patterns to consider, each with its own trade-offs.


Option 1: Populating the Event Store (DB Table) Before Publishing to Kafka (Transactional Outbox Pattern)

This is generally the most robust and recommended approach for ensuring at-least-once (often effectively once) persistence of your events. It guarantees that an event is durably stored in your Event Store before it is ever considered for publishing to Kafka.

How it works:

  1. Command Processing:

    • Your Aggregate receives a command and generates a list of atomic events.
    • These events are persisted to your dedicated Event Store table (e.g., events_store_t) within the same local database transaction as any state changes to your aggregate’s materialized view (if applicable). This is the key: a single local transaction.
    • Alongside storing the event in events_store_t, the event is also stored in an “Outbox” table (e.g., outbox_messages) in the same database transaction. The outbox_messages table serves as a temporary holding area for events that need to be published to Kafka.
  2. Outbox Relayer/Publisher:

    • A separate, dedicated process (the “Outbox Relayer” or “Change Data Capture (CDC) Publisher”) continuously monitors the outbox_messages table for new entries.
    • When it finds new events in the outbox_messages table, it reads them and publishes them to Kafka.
    • After successfully publishing to Kafka, it marks the event as “published” in the outbox_messages table or deletes it.
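
If you do not use a CDC tool, the relayer can be a simple scheduled poller. Here is a rough sketch under those assumptions, using the outbox_messages table designed later in this document, a JDBC DataSource, and a Kafka producer; the topic-naming rule is an assumption for illustration:

import java.sql.*;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Runs on a schedule (e.g., every second). Assumes kafkaProducer and dataSource already exist.
void relayOutbox(KafkaProducer<String, String> kafkaProducer, javax.sql.DataSource dataSource) throws Exception {
    try (Connection conn = dataSource.getConnection();
         PreparedStatement select = conn.prepareStatement(
                 "SELECT id, aggregate_id, aggregate_type, event_type, payload FROM outbox_messages ORDER BY timestamp");
         ResultSet rs = select.executeQuery()) {

        while (rs.next()) {
            String topic = rs.getString("aggregate_type").toLowerCase() + "-events"; // e.g., "customer-events"
            String key = rs.getString("aggregate_id");
            String payload = rs.getString("payload");

            // Publish synchronously so the row is only removed after Kafka has acknowledged it.
            kafkaProducer.send(new ProducerRecord<>(topic, key, payload)).get();

            try (PreparedStatement delete = conn.prepareStatement("DELETE FROM outbox_messages WHERE id = ?")) {
                delete.setObject(1, rs.getObject("id"));
                delete.executeUpdate();
            }
        }
    }
}

This is at-least-once: a crash between the send and the delete republishes the row on the next run, which is acceptable because downstream consumers are idempotent. CDC (the Debezium approach described later) removes the polling loop entirely by tailing the database log.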

Why this is best:

  • Atomicity Guaranteed (Local): The critical guarantee is that the event is either stored in your Event Store AND in the Outbox table, or neither. If the application crashes after generating events but before publishing to Kafka, the events are durably stored in the Outbox and will be published later by the relayer.
  • No Data Loss: Events are never lost between generation and publication to Kafka.
  • Decoupling: The service generating events doesn’t need to know about Kafka’s availability. It only needs to commit to its local database. The Outbox Relayer handles the Kafka dependency.
  • Effective Once: Combined with Kafka’s idempotent producer, this provides effectively-once delivery.
  • Source of Truth: The events_store_t database table is our source of truth, and it can be queried directly.

Where the events_store_t is populated:

  • In the same local DB transaction where the events are generated and recorded in the Outbox table.

Option 2: Populating the Event Store (DB Table) After Consuming from Kafka

This approach involves two stages of atomicity: first, the producer guarantees delivery to Kafka, and then the consumer guarantees persistence from Kafka to your Event Store.

How it works:

  1. Command Processing & Kafka Publishing:

    • Your Aggregate generates events.
    • These events are immediately published to Kafka using Kafka producer transactions (as we discussed previously, to guarantee all events from a command are published atomically).
  2. Consumer Processing:

    • Your Kafka consumer (the one responsible for populating your Event Store) reads events from Kafka.
    • For each event (or batch of events from the same aggregate), it persists the event to your dedicated events_store_t table within a local database transaction.
    • Crucially: It commits the Kafka offset only after the database transaction to events_store_t is successful.

Why this is generally less ideal for the primary Event Store:

  • Producer Responsibility: The service that generates the events also has the responsibility of publishing to Kafka. If Kafka is down or slow, the producer service might be blocked or need to implement complex retry logic.
  • Data Durability Gap: There’s a theoretical, albeit small, window where events are generated but might not yet be durably committed to your authoritative events_store_t database if the consumer or Kafka has issues. (Kafka itself provides durability, but your application’s Event Store is separate).
  • Complexity for Replay: If your consumer fails and you need to replay events, where do you replay from? Kafka? What if Kafka’s retention is short? This pattern requires Kafka to be the true long-term Event Store, or it introduces a reliance on the consumer correctly populating the DB.
  • Source of Truth: The Kafka topic is written first, so it becomes our event store, but it does not support querying the events directly.

Where the events_store_t is populated:

  • In the consumer process, within a local DB transaction, after polling from Kafka.

Answering your Specific Questions:

  • “Where is the best place to populate this table?” The best place is in the same database transaction where the event is generated and stored in an Outbox table (Option 1). This ensures that your authoritative Event Store (your events_store_t table) is always the first and most reliable source of truth.

  • “In the second case, should we populate the failed events (events went to DLQ) to the event store?” This question is about what constitutes “truth” in your Event Store.

    Yes. You should populate all original events into the events_store_t table, regardless of whether they later cause a processing error or end up in a DLQ.

    Reasoning:

    • The events_store_t is a record of what happened in the domain. An event like OrderPlaced is a fact that occurred, regardless of whether a downstream system successfully processed it or failed due to a unique constraint violation.
    • The events_store_t should be pure. It tells the story of your system’s state changes.
    • The fact that an event failed to be processed by a consumer is a processing audit detail that belongs in your notification_t table or system logs, not in the fundamental Event Store.
    • If you don’t put the failed event in events_store_t, you are losing part of your system’s history. When you rebuild state by replaying from events_store_t, you would miss this event, leading to an incorrect state.

    In summary:

    • events_store_t: Stores all events that happened, always.
    • notification_t: Stores the status of processing each event (success/failure, error message), as a projection.
    • DLQ: Stores unconsumable events for reprocessing.

Conclusion

I strongly recommend implementing the Transactional Outbox pattern (Option 1) for populating your events_store_t table. This pattern has become an industry best practice for achieving reliable event publishing from a database-backed service. It is more complex initially but provides superior durability and resilience compared to directly publishing to Kafka from your domain service.

And regardless of the publishing mechanism, your events_store_t should be a complete, immutable log of all domain events, untainted by processing outcomes.

Change Data Capture

Using Change Data Capture (CDC) (like Debezium) for the Transactional Outbox is the gold standard for reliably publishing events from a database-backed service to Kafka.

Here’s a detailed design and a conceptual Java implementation for the producer side, along with the Debezium configuration.


Overall Architecture

  1. Producer Service (Your Java Application):

    • Receives commands (e.g., UpdateCustomerProfileCommand).
    • Interacts with the Customer Aggregate.
    • Generates a list of atomic domain events (e.g., CustomerNameChanged, CustomerAddressChanged).
    • Crucially: Persists these events to two database tables within a single local database transaction:
      • events_store_t: Your immutable, authoritative Event Store (long-term historical log).
      • outbox_messages: A temporary table used by CDC to pick up events for Kafka.
  2. Transactional Outbox Table (outbox_messages):

    • A simple database table that acts as a queue for events to be published.
    • Rows are inserted into this table in the same transaction as any other domain state changes.
  3. CDC Tool (Debezium):

    • Monitors the outbox_messages table (and potentially events_store_t if you want a separate stream for the full event store, though typically you’d monitor the outbox).
    • Detects new rows (inserts).
    • Captures the after image of the inserted row.
    • Transforms this data into a Kafka message.
    • Publishes the Kafka message to the configured topic.
  4. Kafka Topic(s):

    • Events are published here. You can configure Debezium to route events to different topics based on the aggregate_type or event_type from your outbox_messages table.
  5. Kafka Consumers:

    • Your downstream services (stream processors, materialized view builders, notification services) consume from these Kafka topics.
    • They process the events, update their read models, and commit their offsets.

Design of the Database Tables

1. events_store_t (Your Primary Event Store)

This table holds the immutable, ordered sequence of all domain events.

CREATE TABLE events_store_t (
    id UUID PRIMARY KEY,                   -- Unique ID for the event itself
    aggregate_id VARCHAR(255) NOT NULL,    -- The ID of the aggregate (e.g., customer-123)
    aggregate_type VARCHAR(255) NOT NULL,  -- The type of aggregate (e.g., 'Customer')
    event_type VARCHAR(255) NOT NULL,      -- The specific type of event (e.g., 'CustomerNameChanged')
    sequence_number BIGINT NOT NULL,       -- Monotonically increasing sequence number per aggregate
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL, -- When the event occurred
    payload JSONB NOT NULL,                -- The full event payload (JSON)
    metadata JSONB,                        -- Optional: correlation IDs, causation IDs, user ID, etc.
    -- Constraints for event order and uniqueness per aggregate
    UNIQUE (aggregate_id, sequence_number)
);

-- Index for efficient lookup by aggregate
CREATE INDEX idx_events_store_aggregate ON events_store_t (aggregate_id);

2. outbox_messages (For CDC Publishing)

This table serves as the bridge to Kafka.

CREATE TABLE outbox_messages (
    id UUID PRIMARY KEY,                   -- Unique ID for this outbox message
    aggregate_id VARCHAR(255) NOT NULL,    -- The ID of the aggregate (for Kafka key)
    aggregate_type VARCHAR(255) NOT NULL,  -- The type of aggregate (for Kafka topic routing)
    event_type VARCHAR(255) NOT NULL,      -- The specific type of event
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL, -- When the event was created
    payload JSONB NOT NULL,                -- The full event payload (JSON)
    metadata JSONB,                        -- Optional: correlation IDs, causation IDs, user ID, etc.
    -- Note: No sequence_number here, as the Event Store manages that.
    -- Debezium will process these by insertion order.
);
-- An index on timestamp can be useful for manual cleanup or if not using CDC
-- CREATE INDEX idx_outbox_timestamp ON outbox_messages (timestamp);

Java Implementation (Producer Service)

We’ll use Spring Boot for simplicity, Spring Data JPA for database interaction, and Jackson for JSON serialization.

Dependencies (build.gradle):

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.postgresql:postgresql' // Or your chosen DB driver
    runtimeOnly 'com.h2database:h2' // For in-memory testing convenience
    compileOnly 'org.projectlombok:lombok'
    annotationProcessor 'org.projectlombok:lombok'
    implementation 'com.fasterxml.jackson.core:jackson-databind' // For JSON
    implementation 'com.fasterxml.jackson.datatype:jackson-datatype-jsr310' // For Java 8 Date/Time
}

1. Domain Events

// domain/events/DomainEvent.java
package com.example.eventoutbox.domain.events;

import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

import java.time.Instant;
import java.util.UUID;

// Use JsonTypeInfo for polymorphic deserialization (if you need to deserialize events later)
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "eventType")
@JsonSubTypes({
    @JsonSubTypes.Type(value = CustomerNameChanged.class, name = "CustomerNameChanged"),
    @JsonSubTypes.Type(value = CustomerAddressChanged.class, name = "CustomerAddressChanged")
})
public abstract class DomainEvent {
    private final UUID eventId;
    private final Instant timestamp;
    private final String aggregateId;
    private final String aggregateType;
    private final long sequenceNumber; // Important for Event Sourcing

    public DomainEvent(UUID eventId, Instant timestamp, String aggregateId, String aggregateType, long sequenceNumber) {
        this.eventId = eventId;
        this.timestamp = timestamp;
        this.aggregateId = aggregateId;
        this.aggregateType = aggregateType;
        this.sequenceNumber = sequenceNumber;
    }

    public UUID getEventId() { return eventId; }
    public Instant getTimestamp() { return timestamp; }
    public String getAggregateId() { return aggregateId; }
    public String getAggregateType() { return aggregateType; }
    public long getSequenceNumber() { return sequenceNumber; }

    public abstract String getEventType();
}

// domain/events/CustomerNameChanged.java
package com.example.eventoutbox.domain.events;

import java.time.Instant;
import java.util.UUID;

public class CustomerNameChanged extends DomainEvent {
    private final String newName;

    public CustomerNameChanged(UUID eventId, Instant timestamp, String customerId, long sequenceNumber, String newName) {
        super(eventId, timestamp, customerId, "Customer", sequenceNumber);
        this.newName = newName;
    }

    public String getNewName() { return newName; }

    @Override
    public String getEventType() { return "CustomerNameChanged"; }
}

// domain/events/CustomerAddressChanged.java
package com.example.eventoutbox.domain.events;

import java.time.Instant;
import java.util.UUID;

public class CustomerAddressChanged extends DomainEvent {
    private final String newAddress; // Simple string for address example

    public CustomerAddressChanged(UUID eventId, Instant timestamp, String customerId, long sequenceNumber, String newAddress) {
        super(eventId, timestamp, customerId, "Customer", sequenceNumber);
        this.newAddress = newAddress;
    }

    public String getNewAddress() { return newAddress; }

    @Override
    public String getEventType() { return "CustomerAddressChanged"; }
}

2. Aggregate

// domain/Customer.java
package com.example.eventoutbox.domain;

import com.example.eventoutbox.domain.events.CustomerAddressChanged;
import com.example.eventoutbox.domain.events.CustomerNameChanged;
import com.example.eventoutbox.domain.events.DomainEvent;
import lombok.Getter;

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// This is a simplified Aggregate. In a real ES system, you'd load state from events.
// For this example, we're just focusing on event generation.
@Getter
public class Customer {
    private final String customerId;
    private String name;
    private String address;
    private long currentSequenceNumber; // Tracks the next sequence number for new events

    private final List<DomainEvent> uncommittedEvents = new ArrayList<>();

    public Customer(String customerId, long currentSequenceNumber) {
        this.customerId = customerId;
        this.currentSequenceNumber = currentSequenceNumber;
    }

    public static Customer create(String customerId) {
        return new Customer(customerId, 0L); // Start with seq 0 for a new aggregate
    }

    public void changeName(String newName) {
        if (!newName.equals(this.name)) { // Only emit event if something actually changed
            this.name = newName;
            this.currentSequenceNumber++;
            uncommittedEvents.add(new CustomerNameChanged(UUID.randomUUID(), Instant.now(), customerId, currentSequenceNumber, newName));
        }
    }

    public void changeAddress(String newAddress) {
        if (!newAddress.equals(this.address)) {
            this.address = newAddress;
            this.currentSequenceNumber++;
            uncommittedEvents.add(new CustomerAddressChanged(UUID.randomUUID(), Instant.now(), customerId, currentSequenceNumber, newAddress));
        }
    }

    // After events are stored, clear them
    public void markEventsCommitted() {
        this.uncommittedEvents.clear();
    }
}

3. Persistence Layer (Entities and Repositories)

// infrastructure/persistence/outbox/OutboxMessage.java
package com.example.eventoutbox.infrastructure.persistence.outbox;

import jakarta.persistence.*;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.hibernate.annotations.JdbcTypeCode;
import org.hibernate.type.SqlTypes;

import java.time.Instant;
import java.util.UUID;

@Entity
@Table(name = "outbox_messages")
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class OutboxMessage {
    @Id
    private UUID id; // Event ID

    private String aggregateId;
    private String aggregateType;
    private String eventType;
    private Instant timestamp;

    @JdbcTypeCode(SqlTypes.JSON) // For PostgreSQL JSONB type
    @Column(columnDefinition = "jsonb")
    private String payload; // Store payload as JSON string

    @JdbcTypeCode(SqlTypes.JSON)
    @Column(columnDefinition = "jsonb")
    private String metadata; // Optional metadata as JSON string
}

// infrastructure/persistence/outbox/OutboxMessageRepository.java
package com.example.eventoutbox.infrastructure.persistence.outbox;

import org.springframework.data.jpa.repository.JpaRepository;

import java.util.UUID;

public interface OutboxMessageRepository extends JpaRepository<OutboxMessage, UUID> {}

// infrastructure/persistence/eventstore/EventStoreEvent.java
package com.example.eventoutbox.infrastructure.persistence.eventstore;

import jakarta.persistence.*;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.hibernate.annotations.JdbcTypeCode;
import org.hibernate.type.SqlTypes;

import java.time.Instant;
import java.util.UUID;

@Entity
@Table(name = "events_store_t")
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class EventStoreEvent {
    @Id
    private UUID id; // Event ID

    private String aggregateId;
    private String aggregateType;
    private String eventType;
    private Instant timestamp;
    private long sequenceNumber;

    @JdbcTypeCode(SqlTypes.JSON)
    @Column(columnDefinition = "jsonb")
    private String payload; // Store payload as JSON string

    @JdbcTypeCode(SqlTypes.JSON)
    @Column(columnDefinition = "jsonb")
    private String metadata; // Optional metadata as JSON string
}

// infrastructure/persistence/eventstore/EventStoreEventRepository.java
package com.example.eventoutbox.infrastructure.persistence.eventstore;

import org.springframework.data.jpa.repository.JpaRepository;

import java.util.UUID;

public interface EventStoreEventRepository extends JpaRepository<EventStoreEvent, UUID> {}

4. Application Service (Handles Commands and Persistence)

This is where the magic of the single transaction happens.

// application/CustomerApplicationService.java
package com.example.eventoutbox.application;

import com.example.eventoutbox.domain.Customer;
import com.example.eventoutbox.domain.events.DomainEvent;
import com.example.eventoutbox.infrastructure.persistence.eventstore.EventStoreEvent;
import com.example.eventoutbox.infrastructure.persistence.eventstore.EventStoreEventRepository;
import com.example.eventoutbox.infrastructure.persistence.outbox.OutboxMessage;
import com.example.eventoutbox.infrastructure.persistence.outbox.OutboxMessageRepository;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import java.io.IOException;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;

@Service
@RequiredArgsConstructor
public class CustomerApplicationService {

    private final OutboxMessageRepository outboxMessageRepository;
    private final EventStoreEventRepository eventStoreEventRepository;
    private final ObjectMapper objectMapper; // For JSON serialization

    // Represents an incoming command from e.g., a REST endpoint
    public record UpdateCustomerProfileCommand(String customerId, String newName, String newAddress) {}

    // @Transactional ensures that all database operations within this method
    // (saving to outbox_messages and events_store_t) are part of a single DB transaction.
    @Transactional
    public void updateCustomerProfile(UpdateCustomerProfileCommand command) {
        // --- 1. Load/Create Aggregate (Simplified for this example) ---
        // In a real Event Sourcing system, you would load the Customer's state
        // by replaying events from eventStoreEventRepository for command.customerId.
        // For simplicity, we'll assume a new customer or just focus on event generation.
        Customer customer = Customer.create(command.customerId());
        // customer.loadFromEvents(eventStoreEventRepository.findByAggregateIdOrderBySequenceNumberAsc(command.customerId));
        
        // --- 2. Apply Business Logic & Generate Events ---
        if (command.newName() != null) {
            customer.changeName(command.newName());
        }
        if (command.newAddress() != null) {
            customer.changeAddress(command.newAddress());
        }

        // --- 3. Persist Events to Event Store & Outbox (Atomically) ---
        List<DomainEvent> eventsToStore = customer.getUncommittedEvents();
        if (eventsToStore.isEmpty()) {
            return; // No changes, no events to publish
        }

        List<EventStoreEvent> eventStoreEntities = eventsToStore.stream()
            .map(this::mapToEventStoreEvent)
            .collect(Collectors.toList());
        eventStoreEventRepository.saveAll(eventStoreEntities); // Save to the authoritative Event Store

        List<OutboxMessage> outboxMessages = eventsToStore.stream()
            .map(this::mapToOutboxMessage)
            .collect(Collectors.toList());
        outboxMessageRepository.saveAll(outboxMessages); // Save to the Outbox for CDC

        customer.markEventsCommitted(); // Clear uncommitted events after successful persistence
    }

    private OutboxMessage mapToOutboxMessage(DomainEvent event) {
        try {
            return OutboxMessage.builder()
                .id(event.getEventId())
                .aggregateId(event.getAggregateId())
                .aggregateType(event.getAggregateType())
                .eventType(event.getEventType())
                .timestamp(event.getTimestamp())
                .payload(objectMapper.writeValueAsString(event)) // Serialize event to JSON
                .metadata(null) // Add actual metadata if needed
                .build();
        } catch (IOException e) {
            throw new RuntimeException("Failed to serialize event to JSON: " + event.getEventId(), e);
        }
    }

    private EventStoreEvent mapToEventStoreEvent(DomainEvent event) {
        try {
            return EventStoreEvent.builder()
                .id(event.getEventId())
                .aggregateId(event.getAggregateId())
                .aggregateType(event.getAggregateType())
                .eventType(event.getEventType())
                .timestamp(event.getTimestamp())
                .sequenceNumber(event.getSequenceNumber())
                .payload(objectMapper.writeValueAsString(event)) // Serialize event to JSON
                .metadata(null) // Add actual metadata if needed
                .build();
        } catch (IOException e) {
            throw new RuntimeException("Failed to serialize event to JSON: " + event.getEventId(), e);
        }
    }
}

5. REST Controller (Entry Point)

// application/CustomerController.java
package com.example.eventoutbox.application;

import lombok.RequiredArgsConstructor;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/customers")
@RequiredArgsConstructor
public class CustomerController {

    private final CustomerApplicationService customerApplicationService;

    @PostMapping("/profile")
    public ResponseEntity<String> updateCustomerProfile(@RequestBody CustomerApplicationService.UpdateCustomerProfileCommand command) {
        customerApplicationService.updateCustomerProfile(command);
        return ResponseEntity.ok("Customer profile update command received and processed.");
    }
}

6. Spring Boot Application (and application.properties)

// EventOutboxApplication.java
package com.example.eventoutbox;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class EventOutboxApplication {
    public static void main(String[] args) {
        SpringApplication.run(EventOutboxApplication.class, args);
    }
}
# application.properties (for H2 in-memory for testing)
spring.datasource.url=jdbc:h2:mem:testdb;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
# Use 'update' for schema management in dev only
spring.jpa.hibernate.ddl-auto=update
# Good practice for Instant: write dates as ISO-8601 strings instead of numeric timestamps
spring.jackson.serialization.write-dates-as-timestamps=false

# If using PostgreSQL:
# spring.datasource.url=jdbc:postgresql://localhost:5432/yourdb
# spring.datasource.username=youruser
# spring.datasource.password=yourpassword
# spring.jpa.database-platform=org.hibernate.dialect.PostgreSQLDialect

Debezium Configuration (Conceptual)

You’ll deploy Debezium as a Kafka Connect connector. Here’s a sample configuration (e.g., postgresql-outbox-connector.json) for PostgreSQL.

{
  "name": "outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "tasks.max": "1",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "secret",
    "database.dbname": "configserver",
    "database.server.name": "postgres",
    "topic.prefix": "portal-event",
    "schema.include.list": "public",
    "table.include.list": "public.outbox_message_t",
    "message.key.columns": "public.outbox_message_t:host_id",
    "plugin.name": "pgoutput",
    "publication.name": "dbz_publication",
    "slot.name": "dbz_replication_slot",
    "slot.drop.on.stop": "false",
    "signal.when.disconnected": "true",
    "tombstones.on.delete": "true",
    "max.retries": 5,
    "retry.delay.ms": 10000,
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter.schemas.enable": "false",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "false",
    "transforms": "unwrap,addTransactionIdHeader,timestamp_converter,outbox,extractPayload,extractKey,final_route",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    "transforms.unwrap.drop.tombstones": "false",
    "transforms.unwrap.delete.handling.mode": "none",
    "transforms.addTransactionIdHeader.type": "org.apache.kafka.connect.transforms.HeaderFrom$Value",
    "transforms.addTransactionIdHeader.fields": "transaction_id",
    "transforms.addTransactionIdHeader.headers": "transaction_id",
    "transforms.addTransactionIdHeader.operation": "copy",
    "transforms.timestamp_converter.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
    "transforms.timestamp_converter.field": "event_ts",
    "transforms.timestamp_converter.target.type": "unix",
    "transforms.timestamp_converter.format": "yyyy-MM-dd'T'HH:mm:ss.SSSSSS'Z'",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.table.field.event.id": "id",
    "transforms.outbox.table.field.event.key": "host_id",
    "transforms.outbox.table.field.event.type": "event_type",
    "transforms.outbox.table.field.event.timestamp": "event_ts",
    "transforms.outbox.table.field.event.payload": "payload",
    "transforms.outbox.table.field.event.metadata": "metadata",
    "transforms.outbox.table.field.aggregate.type": "aggregate_type",
    "transforms.outbox.table.field.aggregate.id": "aggregate_id",
    "transforms.extractPayload.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
    "transforms.extractPayload.field": "payload",
    "transforms.extractKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extractKey.field": "host_id",
    "transforms.final_route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.final_route.regex": "portal-event\\.public\\.outbox_message_t",
    "transforms.final_route.replacement": "portal-event"
  }
}

And here is the curl command to create the connector locally.

curl --location --request POST 'http://localhost:8083/connectors' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "tasks.max": "1",
    
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "secret",
    "database.dbname": "configserver",
    "database.server.name": "postgres",
    "topic.prefix": "portal-event", 

    "schema.include.list": "public",
    "table.include.list": "public.outbox_message_t",
    "message.key.columns": "public.outbox_message_t:host_id",

    "plugin.name": "pgoutput",
    "publication.name": "dbz_publication",
    "slot.name": "dbz_replication_slot",
    "slot.drop.on.stop": "false", 
    "signal.when.disconnected": "true",
    "tombstones.on.delete": "true",
    "max.retries": 5,
    "retry.delay.ms": 10000,

    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter.schemas.enable": "false",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "false",

    "transforms": "unwrap,addTransactionIdHeader,timestamp_converter,outbox,extractPayload,extractKey,final_route",

    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    "transforms.unwrap.drop.tombstones": "false",
    "transforms.unwrap.delete.handling.mode": "none",

    "transforms.addTransactionIdHeader.type": "org.apache.kafka.connect.transforms.HeaderFrom$Value",
    "transforms.addTransactionIdHeader.fields": "transaction_id",
    "transforms.addTransactionIdHeader.headers": "transaction_id",
    "transforms.addTransactionIdHeader.operation": "copy",

    "transforms.timestamp_converter.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
    "transforms.timestamp_converter.field": "event_ts",
    "transforms.timestamp_converter.target.type": "unix",
    "transforms.timestamp_converter.format": "yyyy-MM-dd'\''T'\''HH:mm:ss.SSSSSS'\''Z'\''",

    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.table.field.event.id": "id",
    "transforms.outbox.table.field.event.key": "host_id",
    "transforms.outbox.table.field.event.type": "event_type",
    "transforms.outbox.table.field.event.timestamp": "event_ts",
    "transforms.outbox.table.field.event.payload": "payload",
    "transforms.outbox.table.field.event.metadata": "metadata",
    "transforms.outbox.table.field.aggregate.type": "aggregate_type",
    "transforms.outbox.table.field.aggregate.id": "aggregate_id",

    "transforms.extractPayload.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
    "transforms.extractPayload.field": "payload",

    "transforms.extractKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extractKey.field": "host_id",

    "transforms.final_route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.final_route.regex": "portal-event\\.public\\.outbox_message_t", 
    "transforms.final_route.replacement": "portal-event"
  }
}
'

The following are the commands to check the connector status and config:

# Check connector status
curl http://localhost:8083/connectors/outbox-connector/status

# Check connector config
curl http://localhost:8083/connectors/outbox-connector/config

Important Notes on Debezium Transforms:

  • EventRouter Transform: This is a specialized Debezium SMT (Single Message Transform) designed specifically for the Transactional Outbox pattern.
    • It expects the outbox table to provide columns for the event id, key, type, timestamp, payload, and metadata plus the aggregate id and type; the table.field.* options in the configuration above map these to the actual column names (id, host_id, event_type, event_ts, payload, metadata, aggregate_id, aggregate_type).
    • It automatically wraps the payload into the Kafka message value and sets the Kafka key based on aggregate_id.
    • It can route to specific topics (e.g., outbox.Customer, outbox.Order) based on aggregate_type.
    • It filters out DELETE operations on the outbox_messages table (which is what your clean-up process would do, if you had one).
  • Only INSERTs matter for the outbox: when you insert a row into the outbox table, Debezium picks it up and publishes it. After it’s published, you can (optionally) have a separate, idempotent cleanup job (or a Debezium signal) delete the record from the outbox table, as sketched below. Debezium will also capture this DELETE event, but the EventRouter transform will typically filter it out, preventing re-publishing.
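For completeness, here is a minimal sketch of such a cleanup job. It assumes Spring scheduling is enabled (@EnableScheduling), an outbox table named outbox_message_t, and an event_ts timestamp column; the retention window and all names are illustrative rather than existing project code.

// OutboxCleanupJob.java -- optional, scheduled purge of already-published outbox rows.
// Debezium will capture the DELETEs, but the EventRouter transform filters them out,
// so nothing is re-published.
package com.example.eventoutbox.infrastructure.outbox;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

import java.sql.Timestamp;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

@Component
@RequiredArgsConstructor
@Slf4j
public class OutboxCleanupJob {

    private final JdbcTemplate jdbcTemplate;

    // Every day at 03:00, delete outbox rows older than 7 days.
    @Scheduled(cron = "0 0 3 * * *")
    public void purgePublishedOutboxMessages() {
        Timestamp cutoff = Timestamp.from(Instant.now().minus(7, ChronoUnit.DAYS));
        int deleted = jdbcTemplate.update(
            "DELETE FROM outbox_message_t WHERE event_ts < ?", cutoff);
        log.info("Outbox cleanup removed {} rows older than {}", deleted, cutoff);
    }
}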

How to test the Java Producer Service

  1. Run your Spring Boot application.

  2. Use a tool like curl or Postman to send a POST request:

    curl -X POST http://localhost:8080/customers/profile \
    -H "Content-Type: application/json" \
    -d '{
      "customerId": "customer-abc-123",
      "newName": "Alice Smith",
      "newAddress": "123 Main St, Anytown"
    }'
    
  3. Check your database events_store_t and outbox_messages tables. You should see entries for CustomerNameChanged and CustomerAddressChanged in both, all committed atomically.


Key Benefits of this Setup

  • Guaranteed Event Persistence: Events are first stored in your durable events_store_t and outbox_messages tables within a single, local, ACID transaction. This means if your application crashes before the event is published to Kafka, it’s still safe in your database and will be picked up by Debezium later.
  • Decoupling: Your core business logic (in CustomerApplicationService) doesn’t directly interact with Kafka. It only interacts with the database. This makes your service more resilient to Kafka outages.
  • Simplified Retries: Debezium and Kafka Connect handle the complexities of retrying Kafka publication.
  • Single Source of Truth: Your events_store_t remains the authoritative event log for replay and aggregate reconstruction.
  • Scalability: You can scale your application service and Debezium independently.

This pattern is a fundamental building block for highly reliable, event-driven microservices.

Multiple Topics

This is a classic scenario in event-driven architectures: an event needs to trigger processing in multiple downstream systems. The key is maintaining atomicity and understanding transaction boundaries.

Given your setup where:

  1. ScheduleCreatedEvent originates from your service’s outbox.
  2. Debezium pushes it to portal-event.
  3. Your PortalEventConsumer reads from portal-event and performs database updates (like notification_t).
  4. The same event also needs to be processed by the Schedule Kafka Streams application.
  5. All operations related to processing this event should ideally be atomic.

Understanding the Transactional Challenge

Your PortalEventConsumer has a well-defined transactional boundary: [Start DB Tx] -> [DB Updates (e.g., notification_t)] -> [DB Commit] -> [Kafka Consumer Offset Commit]

You want to add “push to light-schedule” into this atomic unit.

Options for Pushing to light-schedule

Let’s evaluate the best places:

Option 1: Publish Directly from the Consumer

  • Approach: Inside the PortalEventConsumer loop, after processing ScheduleCreatedEvent and before conn.commit(), instantiate a Kafka Producer and producer.send() the event to light-schedule.
  • Problem: This is incredibly difficult to make truly atomic across all three resources (source Kafka topic portal-event offset, your database transaction, AND the target Kafka topic light-schedule).
    • If producer.send() to light-schedule fails after conn.commit() but before consumer.commitSync(), you have an inconsistent state: notification_t is updated, but light-schedule didn’t get the event. The consumer will re-process, leading to duplicates in notification_t (which requires idempotency) and potential duplicates to light-schedule.
    • Managing Kafka Producer transactions nested within a JDBC transaction is not standard and adds immense complexity.
Option 2: Second Transactional Outbox

  • Approach:
    1. When PortalEventConsumer processes ScheduleCreatedEvent from portal-event, it updates notification_t (and any other DB projections) in its current DB transaction.
    2. Within the same DB transaction, it also inserts a record (representing the ScheduleCreatedEvent for light-schedule) into a new, dedicated outbox table (e.g., schedule_events_outbox_t).
    3. A second Debezium connector (or a polling publisher) then monitors schedule_events_outbox_t and pushes events to the light-schedule topic.
  • Benefits:
    • True Atomicity: The event lands in notification_t AND is queued for light-schedule publishing, all within the PortalEventConsumer’s single DB transaction. This is guaranteed.
    • High Reliability: Leverages the proven Transactional Outbox pattern again.
  • Drawbacks:
    • Adds another outbox table to manage.
    • Requires another Debezium connector instance.
    • More operational overhead.
Option 3: Separate Kafka Streams Application

  • Approach:
    1. Your PortalEventConsumer continues to subscribe to portal-event and performs its database updates to notification_t (and other projections) as it currently does. It remains the sink for all events from portal-event into your relational database.
    2. Create a separate, dedicated Kafka Streams application whose sole purpose is to process scheduling events.
    3. This Kafka Streams application subscribes directly to the portal-event topic.
    4. It uses the Kafka Streams DSL to filter for ScheduleCreatedEvents and forwards them to the light-schedule topic (a sketch follows at the end of this section).
  • Benefits:
    • Clean Separation of Concerns: Your PortalEventConsumer is a database sink. Your Kafka Streams app is a stream processor.
    • Kafka Streams EOS (Exactly-Once Semantics): Kafka Streams natively handles the transactional guarantees (atomic consumption from portal-event, processing, and production to light-schedule).
    • Simpler Code: No complex producer/consumer/DB transaction coordination in one app.
    • Scalability: Each application can scale independently.
  • Drawbacks:
    • Adds another logical application to deploy and manage.

Best Place to Push to light-schedule:

For your setup, the Separate Kafka Streams Application (Option 3) is generally the best approach.

  • Your PortalEventConsumer’s role: It acts as a generic projection builder into your relational database, consuming all events from portal-event and updating notification_t (and any other necessary read models). This ensures a full audit and visibility for all processed events in your DB.
  • The new Kafka Streams app’s role: It acts as a specialized router and processor for ScheduleCreatedEvents specifically, forwarding them to the appropriate Kafka Streams pipeline (light-schedule).

This maintains a clean, decoupled architecture where each component has a clear responsibility and leverages Kafka’s native stream processing capabilities for atomic Kafka-to-Kafka operations.
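As a concrete illustration of Option 3, here is a minimal sketch of such a Kafka Streams application. It assumes the records on portal-event are JSON strings carrying an eventType field; the field name, event-type value, bootstrap address, and topology shown are assumptions rather than existing project code.

// ScheduleEventRouter.java -- dedicated Kafka Streams app that routes ScheduleCreatedEvents
// from portal-event to light-schedule.
package com.example.eventoutbox.streams;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Properties;

public class ScheduleEventRouter {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "schedule-event-router");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Exactly-once semantics: consuming from portal-event and producing to
        // light-schedule are committed atomically by Kafka Streams.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("portal-event", Consumed.with(Serdes.String(), Serdes.String()))
               .filter((key, value) -> isScheduleCreated(value))
               .to("light-schedule", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    private static boolean isScheduleCreated(String json) {
        try {
            return "ScheduleCreatedEvent".equals(MAPPER.readTree(json).path("eventType").asText());
        } catch (Exception e) {
            return false; // Malformed payloads are simply not routed to light-schedule
        }
    }
}

Because the topology only consumes, filters, and produces, enabling exactly-once processing gives the atomic Kafka-to-Kafka behavior described above without any extra coordination code.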

Database Concurrency

Multiple users updating the same aggregate is a classic concurrency problem in multi-user applications, often referred to as the “lost update” problem. In an Event Sourcing system, preventing this overwrite is crucial because the sequence of events defines the state.

The standard and most effective way to prevent concurrent updates from overwriting each other in an Event Sourcing system is through Optimistic Concurrency Control (OCC), specifically using version numbers (or sequence numbers) at the aggregate level.


How Optimistic Concurrency Control (OCC) Works in Event Sourcing

  1. Version Tracking (Sequence Number):

    • Every Aggregate (e.g., a Customer, an Order, a Product) has a version, which is typically its current sequence number in the event stream. This sequence number represents the number of events that have been applied to build its current state.
    • Your events_store_t table already has sequence_number for this purpose:
      CREATE TABLE events_store_t (
          id UUID PRIMARY KEY,
          aggregate_id VARCHAR(255) NOT NULL,
          -- ... other fields ...
          sequence_number BIGINT NOT NULL,       -- This is the key!
          UNIQUE (aggregate_id, sequence_number) -- CRITICAL constraint!
      );
      
      The UNIQUE (aggregate_id, sequence_number) constraint is the fundamental database-level guarantee against concurrent writes for the same aggregate at the same version.
  2. Load the Aggregate’s Current Version:

    • When your application service wants to modify an aggregate, it first loads the aggregate’s current state by replaying all events for that aggregate_id from the events_store_t.
    • During this replay, it tracks the currentSequenceNumber (the sequence number of the last event applied).
  3. Pass Expected Version with Command:

    • The user interface (UI) or the client application that initiated the change should also hold the currentSequenceNumber it observed when it last fetched the aggregate’s state.
    • This expectedVersion (or expectedSequenceNumber) is then sent along with the command (e.g., UpdateCustomerProfileCommand(customerId, newName, newAddress, expectedSequenceNumber)).
  4. Conditional Event Appending:

    • When your CustomerApplicationService receives the command:
      • It loads the Customer aggregate from the events_store_t, determining its actual currentSequenceNumber.
      • It compares the command.expectedSequenceNumber with the customer.actualCurrentSequenceNumber (derived from the Event Store).
      • If command.expectedSequenceNumber does NOT match customer.actualCurrentSequenceNumber: This means another concurrent transaction has already written new events for this aggregate since the client loaded its state. A ConcurrencyException (or similar domain-specific exception) is thrown.
      • If they DO match: The aggregate’s business logic is applied, generating new events. These new events will have customer.actualCurrentSequenceNumber + 1, customer.actualCurrentSequenceNumber + 2, etc.
  5. Atomic Persistence (The DB Constraint):

    • The new events are then attempted to be saved to events_store_t (and outbox_messages) within a single database transaction.
    • If a concurrency conflict was not detected at step 4 (meaning two commands arrived almost simultaneously and passed the initial check), the UNIQUE (aggregate_id, sequence_number) constraint in the events_store_t table will prevent the “lost update.” Only the first transaction to successfully insert events with the “next” sequence numbers will succeed. The second will fail with a DataIntegrityViolationException (or similar).

Example Flow:

  1. User A fetches Customer-123. The current state (replayed from events_store_t) shows sequenceNumber = 5.
  2. User B also fetches Customer-123. It also sees sequenceNumber = 5.
  3. User A sends UpdateCustomerProfileCommand(customerId="123", newName="Alice", expectedSequenceNumber=5).
    • App Service loads Customer-123, actual sequenceNumber = 5. Matches expectedSequenceNumber.
    • Generates CustomerNameChanged event with sequenceNumber = 6.
    • Attempts to save event(s) to events_store_t (and outbox_messages). Succeeds.
  4. User B sends UpdateCustomerProfileCommand(customerId="123", newAddress="456 Oak", expectedSequenceNumber=5).
    • App Service loads Customer-123. It now replays events up to sequenceNumber = 6. So, actualSequenceNumber = 6.
    • It compares command.expectedSequenceNumber=5 with customer.actualSequenceNumber=6. They do NOT match!
    • The CustomerApplicationService throws a ConcurrencyException.
    • The transaction is rolled back, and no events are written from User B’s command.

Java Implementation Changes

Let’s modify the previous CustomerApplicationService and add a way to load the aggregate from events.

1. Customer Aggregate (Revised)

// domain/Customer.java (Revised)
package com.example.eventoutbox.domain;

import com.example.eventoutbox.domain.events.CustomerAddressChanged;
import com.example.eventoutbox.domain.events.CustomerNameChanged;
import com.example.eventoutbox.domain.events.DomainEvent;
import lombok.Getter;

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

@Getter
public class Customer {
    private final String customerId;
    private String name;
    private String address;
    private long version; // This is the 'sequenceNumber' of the LAST applied event

    private final List<DomainEvent> uncommittedEvents = new ArrayList<>();

    // Constructor for creating a new aggregate
    public Customer(String customerId) {
        this.customerId = customerId;
        this.version = 0; // New aggregates start at version 0
    }

    // Static factory method to load an aggregate from its events
    public static Customer loadFromEvents(String customerId, List<DomainEvent> history) {
        Customer customer = new Customer(customerId);
        history.forEach(customer::applyEvent); // Apply each historical event
        return customer;
    }

    // Method to apply an event to the aggregate's state
    private void applyEvent(DomainEvent event) {
        // This is where you would update the aggregate's internal state
        // based on the specific event type.
        if (event instanceof CustomerNameChanged nameChanged) {
            this.name = nameChanged.getNewName();
        } else if (event instanceof CustomerAddressChanged addressChanged) {
            this.address = addressChanged.getNewAddress();
        }
        this.version = event.getSequenceNumber(); // Update version to the sequence number of the applied event
    }

    // Domain behavior methods that generate new events
    public void changeName(String newName) {
        if (!newName.equals(this.name)) {
            // New events get the *next* sequence number
            long nextSequence = this.version + 1;
            CustomerNameChanged event = new CustomerNameChanged(UUID.randomUUID(), Instant.now(), customerId, nextSequence, newName);
            uncommittedEvents.add(event);
            applyEvent(event); // Apply immediately to current state for consistency
        }
    }

    public void changeAddress(String newAddress) {
        if (!newAddress.equals(this.address)) {
            long nextSequence = this.version + 1;
            CustomerAddressChanged event = new CustomerAddressChanged(UUID.randomUUID(), Instant.now(), customerId, nextSequence, newAddress);
            uncommittedEvents.add(event);
            applyEvent(event);
        }
    }

    public void markEventsCommitted() {
        this.uncommittedEvents.clear();
    }
}

2. ConcurrencyException

// domain/ConcurrencyException.java
package com.example.eventoutbox.domain;

public class ConcurrencyException extends RuntimeException {
    public ConcurrencyException(String message) {
        super(message);
    }

    // Used when wrapping a lower-level cause, e.g., a DataIntegrityViolationException
    // from the UNIQUE (aggregate_id, sequence_number) constraint.
    public ConcurrencyException(String message, Throwable cause) {
        super(message, cause);
    }
}

3. CustomerApplicationService (Revised)

// application/CustomerApplicationService.java (Revised)
package com.example.eventoutbox.application;

import com.example.eventoutbox.domain.ConcurrencyException;
import com.example.eventoutbox.domain.Customer;
import com.example.eventoutbox.domain.events.DomainEvent;
import com.example.eventoutbox.infrastructure.persistence.eventstore.EventStoreEvent;
import com.example.eventoutbox.infrastructure.persistence.eventstore.EventStoreEventRepository;
import com.example.eventoutbox.infrastructure.persistence.outbox.OutboxMessage;
import com.example.eventoutbox.infrastructure.persistence.outbox.OutboxMessageRepository;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor;
import org.springframework.dao.DataIntegrityViolationException;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import java.util.List;
import java.util.Optional;
import java.util.UUID;
import java.util.stream.Collectors;

@Service
@RequiredArgsConstructor
public class CustomerApplicationService {

    private final OutboxMessageRepository outboxMessageRepository;
    private final EventStoreEventRepository eventStoreEventRepository;
    private final ObjectMapper objectMapper;

    // Command now includes expectedVersion
    public record UpdateCustomerProfileCommand(String customerId, String newName, String newAddress, long expectedVersion) {}

    @Transactional
    public void updateCustomerProfile(UpdateCustomerProfileCommand command) {
        // --- 1. Load Aggregate State ---
        List<EventStoreEvent> historicalEvents = eventStoreEventRepository.findByAggregateIdOrderBySequenceNumberAsc(command.customerId());

        Customer customer;
        if (historicalEvents.isEmpty()) {
            customer = new Customer(command.customerId());
            // If it's a new aggregate, expectedVersion must be 0
            if (command.expectedVersion() != 0) {
                 throw new ConcurrencyException("Customer with ID " + command.customerId() + " does not exist or expected version is incorrect.");
            }
        } else {
            // Deserialize historical events to DomainEvent objects
            List<DomainEvent> domainEventsHistory = historicalEvents.stream()
                .map(this::deserializeEventStoreEvent)
                .collect(Collectors.toList());
            customer = Customer.loadFromEvents(command.customerId(), domainEventsHistory);

            // --- 2. OPTIMISTIC CONCURRENCY CHECK ---
            if (customer.getVersion() != command.expectedVersion()) {
                throw new ConcurrencyException(
                    "Customer with ID " + command.customerId() + " has been updated by another user. " +
                    "Expected version " + command.expectedVersion() + " but found " + customer.getVersion() + "."
                );
            }
        }

        // --- 3. Apply Business Logic & Generate Events ---
        if (command.newName() != null) {
            customer.changeName(command.newName());
        }
        if (command.newAddress() != null) {
            customer.changeAddress(command.newAddress());
        }

        // --- 4. Persist Events to Event Store & Outbox (Atomically) ---
        List<DomainEvent> eventsToStore = customer.getUncommittedEvents();
        if (eventsToStore.isEmpty()) {
            return; // No changes, no events to publish
        }

        try {
            List<EventStoreEvent> eventStoreEntities = eventsToStore.stream()
                .map(this::mapToEventStoreEvent)
                .collect(Collectors.toList());
            eventStoreEventRepository.saveAll(eventStoreEntities);

            List<OutboxMessage> outboxMessages = eventsToStore.stream()
                .map(this::mapToOutboxMessage)
                .collect(Collectors.toList());
            outboxMessageRepository.saveAll(outboxMessages);

            customer.markEventsCommitted();
        } catch (DataIntegrityViolationException e) {
            // This catches the UNIQUE constraint violation on (aggregate_id, sequence_number)
            // This means another transaction has just written to this aggregate
            throw new ConcurrencyException(
                "Another concurrent update detected for customer " + command.customerId() + ". " +
                "Please refresh and try again.", e
            );
        }
    }

    // Helper methods for mapping/deserializing (similar to before)
    private OutboxMessage mapToOutboxMessage(DomainEvent event) {
        try {
            return OutboxMessage.builder()
                .id(event.getEventId())
                .aggregateId(event.getAggregateId())
                .aggregateType(event.getAggregateType())
                .eventType(event.getEventType())
                .timestamp(event.getTimestamp())
                .payload(objectMapper.writeValueAsString(event))
                .metadata(null)
                .build();
        } catch (JsonProcessingException e) {
            throw new RuntimeException("Failed to serialize event to JSON: " + event.getEventId(), e);
        }
    }

    private EventStoreEvent mapToEventStoreEvent(DomainEvent event) {
        try {
            return EventStoreEvent.builder()
                .id(event.getEventId())
                .aggregateId(event.getAggregateId())
                .aggregateType(event.getAggregateType())
                .eventType(event.getEventType())
                .timestamp(event.getTimestamp())
                .sequenceNumber(event.getSequenceNumber())
                .payload(objectMapper.writeValueAsString(event))
                .metadata(null)
                .build();
        } catch (JsonProcessingException e) {
            throw new RuntimeException("Failed to serialize event to JSON: " + event.getEventId(), e);
        }
    }

    private DomainEvent deserializeEventStoreEvent(EventStoreEvent eventStoreEvent) {
        try {
            // Assuming your event JSON includes the 'eventType' field for polymorphic deserialization
            return objectMapper.readValue(eventStoreEvent.getPayload(), DomainEvent.class);
        } catch (JsonProcessingException e) {
            throw new RuntimeException("Failed to deserialize event: " + eventStoreEvent.getId(), e);
        }
    }
}
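The deserializeEventStoreEvent helper above assumes the payload carries enough type information for Jackson to pick the concrete event class. One way to satisfy that assumption (a sketch only: the discriminator property name, the subtype names, and the getters are inferred from the usage in this example and may differ from your actual DomainEvent definition) is to annotate the event base type:

// domain/events/DomainEvent.java -- polymorphic (de)serialization sketch.
// The "eventType" discriminator must match the value produced by getEventType().
package com.example.eventoutbox.domain.events;

import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

import java.time.Instant;
import java.util.UUID;

@JsonTypeInfo(
    use = JsonTypeInfo.Id.NAME,                   // write/read a logical type name ...
    include = JsonTypeInfo.As.EXISTING_PROPERTY,  // ... stored in the existing eventType field
    property = "eventType",
    visible = true)
@JsonSubTypes({
    @JsonSubTypes.Type(value = CustomerNameChanged.class, name = "CustomerNameChanged"),
    @JsonSubTypes.Type(value = CustomerAddressChanged.class, name = "CustomerAddressChanged")
})
public interface DomainEvent {
    UUID getEventId();
    String getAggregateId();
    String getAggregateType();
    String getEventType();
    Instant getTimestamp();
    long getSequenceNumber();
}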

4. EventStoreEventRepository (Add find method)

// infrastructure/persistence/eventstore/EventStoreEventRepository.java (Revised)
package com.example.eventoutbox.infrastructure.persistence.eventstore;

import org.springframework.data.jpa.repository.JpaRepository;

import java.util.List;
import java.util.UUID;

public interface EventStoreEventRepository extends JpaRepository<EventStoreEvent, UUID> {
    List<EventStoreEvent> findByAggregateIdOrderBySequenceNumberAsc(String aggregateId);
}

5. CustomerController (Handle Exception)

// application/CustomerController.java (Revised)
package com.example.eventoutbox.application;

import com.example.eventoutbox.domain.ConcurrencyException;
import lombok.RequiredArgsConstructor;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/customers")
@RequiredArgsConstructor
public class CustomerController {

    private final CustomerApplicationService customerApplicationService;

    public record UpdateCustomerProfileRequest(String customerId, String newName, String newAddress, long expectedVersion) {}

    @PostMapping("/profile")
    public ResponseEntity<String> updateCustomerProfile(@RequestBody UpdateCustomerProfileRequest request) {
        CustomerApplicationService.UpdateCustomerProfileCommand command =
            new CustomerApplicationService.UpdateCustomerProfileCommand(
                request.customerId(), request.newName(), request.newAddress(), request.expectedVersion()
            );
        customerApplicationService.updateCustomerProfile(command);
        return ResponseEntity.ok("Customer profile update command received and processed.");
    }

    @ExceptionHandler(ConcurrencyException.class)
    public ResponseEntity<String> handleConcurrencyException(ConcurrencyException ex) {
        return ResponseEntity.status(HttpStatus.CONFLICT).body(ex.getMessage());
    }
}

How to Handle Concurrency Conflicts on the Client/UI Side:

When ConcurrencyException is thrown:

  1. Inform the User: Display a message like “This item has been updated by another user. Please refresh the page to see the latest changes and try your update again.”
  2. Retry (less common for user-facing flows, but possible for background jobs): For non-interactive or automated processes, you might implement a retry mechanism (sketched after this list). This retry would need to:
    • Fetch the latest state of the aggregate from a read model.
    • Re-create the command based on the original intent and the newly fetched expected version.
    • Re-send the command.
    • This is typically only done if the change is “safe” to re-apply (e.g., adding an item, not changing a specific value).
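A minimal sketch of such a background retry, assuming the CustomerApplicationService above. The readCustomerVersion() helper is hypothetical and stands in for whatever read-model query returns the latest aggregate version:

// BackgroundProfileUpdater.java -- automated retry for non-interactive callers.
package com.example.eventoutbox.application;

import com.example.eventoutbox.domain.ConcurrencyException;

public class BackgroundProfileUpdater {

    private final CustomerApplicationService service;

    public BackgroundProfileUpdater(CustomerApplicationService service) {
        this.service = service;
    }

    public void updateWithRetry(String customerId, String newName, String newAddress) {
        int maxAttempts = 3;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Re-read the latest version before every attempt (hypothetical helper).
            long expectedVersion = readCustomerVersion(customerId);
            try {
                service.updateCustomerProfile(
                    new CustomerApplicationService.UpdateCustomerProfileCommand(
                        customerId, newName, newAddress, expectedVersion));
                return; // success
            } catch (ConcurrencyException e) {
                // Another writer got in first; only retry because this change is "safe" to re-apply.
                if (attempt == maxAttempts) {
                    throw e;
                }
            }
        }
    }

    private long readCustomerVersion(String customerId) {
        // Hypothetical: query a read-model projection for its current aggregate_version.
        throw new UnsupportedOperationException("wire this to your read model");
    }
}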

By combining the version check in your application service with the UNIQUE constraint in your database, you create a robust optimistic concurrency control mechanism that prevents lost updates effectively.

What if the event consumer fails to apply an event to its read model

In this case, the read model becomes stale, and subsequent attempts to update based on that stale data will lead to conflicts.

Let’s break down the scenario and the robust solution.


The Problem Scenario (as you described)

  1. UI: Queries entity_t table (read model), gets Entity (aggregate_version = 5).
  2. User: Makes changes.
  3. UI: Sends UpdateCommand (..., expectedVersion = 5) to the write model.
  4. Write Model (Command Handler):
    • Loads aggregate from event_store_t. Let’s say its actualVersion is 5.
    • OCC Check: actualVersion (5) == expectedVersion (5). Success.
    • Generates Event (..., sequence_number = 6).
    • Persists Event (..., sequence_number = 6) to event_store_t and outbox_message_t in an ACID transaction. This commits version 6 to the event_store_t.
    • Debezium publishes this event to Kafka.
  5. Kafka Consumer (PortalEventConsumer):
    • Reads Event (..., sequence_number = 6, expectedVersion = 5).
    • Tries to update entity_t (your read model): UPDATE entity_t SET ..., aggregate_version = 6 WHERE entity_id = ? AND aggregate_version = 5.
    • FAILURE: An exception occurs in the database update (e.g., a network error, a constraint violation unrelated to aggregate_version, or the consumer’s JVM crashes).
    • Result: The entity_t table is NOT updated and remains at aggregate_version = 5. The event_store_t is at aggregate_version = 6. The read model is now stale.
  6. Next UI interaction:
    • UI queries entity_t again. It still gets Entity (aggregate_version = 5) because the read model is stale.
    • UI sends UpdateCommand (..., expectedVersion = 5).
  7. Write Model (Command Handler) - Second Attempt:
    • Loads aggregate from event_store_t. Its actualVersion is 6.
    • OCC Check: actualVersion (6) != expectedVersion (5). Conflict detected!
    • Result: The command handler throws a ConcurrencyException. It does NOT try to insert a new event into event_store_t with sequence_number=6 (because that would be a duplicate and would indeed fail on the unique constraint). It correctly rejects the command.

The specific symptom you mentioned (“new event insert into the event_store_t and it will fail because the aggregate version is used before”) should ideally not happen if the write model performs the OCC check correctly: in the normal case the ConcurrencyException prevents the duplicate event from ever being generated, and the unique constraint only fires for truly simultaneous writers.

The core problem, then, is stale read models due to consumer processing failures, which then lead to ConcurrencyException at the write model.


The Solution: Robust Kafka Consumer Processing (Retry & DLQ)

The solution lies entirely within your Kafka Consumer’s (PortalEventConsumerStartupHook) error handling strategy.

Your most recent incremental code includes the processSingleEventWithRetries method with retry and DLQ logic. This is precisely the mechanism designed to handle this situation.

Here’s how it’s supposed to work and what you need to ensure is functioning correctly:

  1. Idempotency of Read Model Updates:

    • All your dbProvider.createXxx, updateXxx, deleteXxx methods (e.g., updateRole, deleteRole, createRole) must be idempotent in their database effects (a JDBC sketch follows after this list).
    • For UPDATE and DELETE, WHERE aggregate_version = expectedVersion makes them idempotent. If the update was already applied (or a newer version is present), 0 rows affected means no harm done (though it might still trigger a ConcurrencyException within the consumer’s dbProvider methods if you implement the record-not-found-vs-conflict check).
    • For INSERT, use INSERT ... ON CONFLICT (primary_key) DO UPDATE SET aggregate_version = excluded.aggregate_version, ... (UPSERT) if the “create” event might be re-delivered and you expect it to update an existing record (e.g., in a snapshot table). Otherwise, if it’s strictly a “create-only” and a duplicate PK is a bug, the SQLException for unique constraint violation is correct.
  2. Consumer’s Retry/DLQ Logic (The core fix): The processSingleEventWithRetries method is crucial.

    • Transient Errors:

      • If dbProvider.updateXxx (or any other part of processSingleEventWithRetries) throws a transient SQLException (e.g., connection timeout, deadlock), the currentRetry is incremented, and Thread.sleep occurs.
      • If maxRetries is not exhausted, processSingleEventWithRetries will return false.
      • The onCompletion loop will then break; (meaning it won’t commitSync() any offsets for this batch).
      • On the next readRecords call, the entire batch (including the transiently failed record) will be re-polled and re-processed. This relies on idempotency.
    • Permanent Errors:

      • If dbProvider.updateXxx throws a DbProvider.ConcurrencyException (meaning the read model’s version was stale, so the WHERE aggregate_version = expectedVersion update in the consumer failed with 0 rows, but the record did exist at a higher version) or an IllegalArgumentException (bad data) or a permanent SQLException (e.g., unique constraint violation on an INSERT where it shouldn’t happen, or foreign key constraint violation):
        • processSingleEventWithRetries will catch it and call handlePermanentFailure.
        • handlePermanentFailure sends the original Kafka record to the DLQ.
        • processSingleEventWithRetries then returns true (because the event has been “handled” by being DLQ’d).
        • onCompletion then does include this record’s offset in offsetsToCommit and proceeds to commitSync() for the batch.
        • Result: The consumer makes progress past this “poison pill.” The stale event in entity_t is not updated by this specific event, but the consumer doesn’t get stuck.
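To make point 1 concrete, here is a minimal JDBC sketch of the conditional UPDATE and UPSERT shapes described above. The entity_t table and its columns are illustrative, and the domain ConcurrencyException is reused for brevity in place of a dedicated DbProvider.ConcurrencyException:

// EntityReadModelWriter.java -- idempotent read-model writes with OCC, using plain JDBC.
package com.example.eventoutbox.readmodel;

import com.example.eventoutbox.domain.ConcurrencyException;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class EntityReadModelWriter {

    // Conditional UPDATE: only applies if the read model is still at the expected version.
    // A re-delivered (duplicate) event matches 0 rows and does no harm.
    public void updateEntity(Connection conn, String entityId, String newValue,
                             long expectedVersion) throws SQLException {
        String sql = "UPDATE entity_t SET value = ?, aggregate_version = ? " +
                     "WHERE entity_id = ? AND aggregate_version = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, newValue);
            ps.setLong(2, expectedVersion + 1);
            ps.setString(3, entityId);
            ps.setLong(4, expectedVersion);
            if (ps.executeUpdate() == 0) {
                // Either the row does not exist or it is already at a newer version;
                // distinguishing the two lets the consumer DLQ only real conflicts.
                throw new ConcurrencyException(
                    "entity " + entityId + " is not at expected version " + expectedVersion);
            }
        }
    }

    // Idempotent "create": a PostgreSQL UPSERT so a re-delivered create event does not
    // blow up on the primary key, and never moves the row back to an older version.
    public void createEntity(Connection conn, String entityId, String value,
                             long version) throws SQLException {
        String sql = "INSERT INTO entity_t (entity_id, value, aggregate_version) VALUES (?, ?, ?) " +
                     "ON CONFLICT (entity_id) DO UPDATE " +
                     "SET value = EXCLUDED.value, aggregate_version = EXCLUDED.aggregate_version " +
                     "WHERE entity_t.aggregate_version < EXCLUDED.aggregate_version";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, entityId);
            ps.setString(2, value);
            ps.setLong(3, version);
            ps.executeUpdate();
        }
    }
}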

How to Handle the Stale UI Problem

Once the consumer’s retry/DLQ is robust, the stale UI becomes a UX problem rather than a system consistency problem.

  1. Producer’s ConcurrencyException is Key: When the UI sends UpdateCommand(..., expectedVersion = 5) and the event_store_t is already at version 6, the write model will throw ConcurrencyException. This is the correct behavior.

  2. UI Response to ConcurrencyException:

    • The UI must catch this ConcurrencyException from the API.
    • It should display a clear message to the user: “This record has been updated by another user. Please refresh the page to see the latest changes and re-apply your modifications if desired.”
    • It should then trigger a refresh of the data from the read model, which will then fetch Entity (aggregate_version = 6) (assuming the consumer eventually caught up or the record was DLQ’d and a manual fix was applied).

Summary and What to Focus On:

  1. Ensure processSingleEventWithRetries is fully correct and deployed. This is your primary defense against the consumer getting stuck.
  2. Carefully verify the isTransientSqlError logic. It must accurately identify only recoverable database errors. Misclassifying permanent errors as transient will exhaust retries and then DLQ; misclassifying transient as permanent sends to DLQ too soon.
  3. Confirm handlePermanentFailure correctly sends to DLQ and logs.
  4. Confirm your dbProvider methods are truly idempotent for re-delivered messages. UPDATE ... WHERE aggregate_version = expectedVersion is idempotent. INSERTs are trickier.
  5. Educate your UI/API layer: They must handle the ConcurrencyException from the write model by forcing a refresh and user notification.

Your recent changes to PortalEventConsumerStartupHook already implement this retry/DLQ pattern. The next step is to ensure that code is compilable, deployed correctly, and that the dbProvider methods are fully integrated with the aggregate_version for idempotency and OCC checking as discussed in the previous detailed answer.

What if DLQ messages are not handled in a timely fashion

This is a critical operational risk with the Dead Letter Queue (DLQ) pattern: DLQ messages are “handled exceptions,” not “disappearing problems.” If they aren’t processed, they represent real, unapplied business facts that can lead to data inconsistencies and broken business processes over time.

If a message sits in the DLQ for 30 days, your read models will be stale, your UIs will report incorrect data, and downstream systems relying on that information will also be out of sync. This can severely damage data integrity and user trust.


The DLQ is a “Hospital” or “Quarantine Zone,” Not a “Graveyard”

It’s a place for messages that need human intervention or a specific, non-automated re-driving process. It’s not a place for messages to just die.

Strategies to Prevent DLQ Message Stagnation

To ensure DLQ messages are handled in a timely fashion, you need a robust DLQ management strategy that goes beyond just pushing messages to the topic.

1. Robust Monitoring & Alerting (Immediate Action)
  • Metric: Count of messages in DLQ topics (kafka_topic_partition_current_offset, kafka_consumer_group_lag, or custom JMX metrics).
  • Alerting Thresholds:
    • Urgent: Alert immediately (PagerDuty, Slack, SMS) if the number of messages in any DLQ topic goes above 0 or a very small threshold (e.g., 5-10 messages). A DLQ is an exceptional queue.
    • Warning: Alert if messages persist for a certain duration (e.g., 1 hour, 4 hours).
  • Dashboards: Create a dashboard that prominently displays the number of messages in each DLQ topic and their age.
2. Clear Ownership & Standard Operating Procedures (SOPs)
  • Who owns the DLQ? Assign clear responsibility to a specific team (e.g., SRE, Development team for that microservice).
  • What’s the process? Define a clear SOP for handling DLQ alerts:
    1. Acknowledge alert.
    2. Inspect the DLQ message content (payload, error message, original topic/offset).
    3. Identify the root cause (code bug, malformed data, transient external system outage, business process error).
    4. Decide on action:
      • Fix Code/Data: If it’s a bug, deploy a fix. If it’s bad data, decide if it needs manual correction in the database or if upstream data entry needs fixing.
      • Re-drive: After fixing the root cause, re-drive the message(s) back to the original topic.
      • Discard (Rare & Documented): Only if the message is truly unrecoverable garbage or a test message that accidentally ended up there, and its impact is negligible. This decision must be audited and requires strong justification.
3. Automated DLQ Re-driving with Human Trigger (Operational Playbook)
  • You’ll need a “re-driver” tool/application.
  • Purpose: This tool reads messages from the DLQ and publishes them back to their original topic for re-processing (see the sketch after this list).
  • Features:
    • Preview: Show content of DLQ messages before re-driving.
    • Selectivity: Allow re-driving specific messages, or ranges of messages.
    • Filtering: Filter by error type, timestamp, etc.
    • Audit: Log who re-drove what message.
  • Integration:
    • Could be a simple command-line tool.
    • Could be integrated into your internal developer portal or ops dashboard.
    • Could be a scheduled job that runs periodically but requires explicit human approval before actually publishing.
4. Automated Retries (Beyond Initial Consumer)

For certain classes of “permanent-but-maybe-not-really” errors (e.g., external API rate limits, very long-running external process), you could have a separate, simpler consumer that specifically subscribes to the DLQ.

  • Purpose: This DLQ consumer would only handle a very specific, narrow class of DLQ messages.
  • Logic: It would apply its own retry logic (e.g., exponential backoff for a longer period, up to 24 hours).
  • Re-DLQ: If this DLQ consumer also fails after its retries, it would push the message back to the same DLQ topic (or a different, truly “unresolvable” DLQ) to re-trigger human intervention.
  • Caution: This adds complexity and should only be done for errors you’ve explicitly identified as potentially auto-recoverable over a very long time.
5. Data Reconciliation (Long-Term Strategy)
  • For critical business data, even with DLQs, sometimes manual reconciliation or a dedicated reconciliation service might be needed.
  • This involves periodically comparing your read model state with the authoritative Event Store (or an authoritative external system) to detect and resolve discrepancies that might have slipped through.
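A minimal sketch of the re-driver tool from point 3, assuming a DLQ topic named portal-event-dlq and the original portal-event topic; a real tool would add the preview, selectivity, filtering, and audit features listed above:

// DlqRedriver.java -- human-triggered tool that re-publishes DLQ records to the original topic.
package com.example.eventoutbox.ops;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class DlqRedriver {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "dlq-redriver");
        consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        consumerProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(List.of("portal-event-dlq"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                // Audit trail: log what is being re-driven before publishing it back.
                System.out.printf("Re-driving offset %d key %s%n", record.offset(), record.key());
                producer.send(new ProducerRecord<>("portal-event", record.key(), record.value()));
            }
            producer.flush();
            consumer.commitSync(); // Only commit DLQ offsets after re-publishing succeeded
        }
    }
}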

Addressing “Most errors cannot be recovered anyway”

If “most errors cannot be recovered anyway” and always lead to DLQ, it points to a need for:

  1. Better isTransientSqlError logic: Are you correctly identifying all transient SQL errors that could be retried by the main consumer?
  2. More robust validation upstream: Can you prevent more malformed/invalid data from even generating events that will fail downstream? (E.g., more validation in the command handler).
  3. Refined Domain Model/Business Rules: Sometimes, a “permanent” error in the consumer points to a business rule that isn’t fully handled by events, or a valid state transition that wasn’t expected.

DLQs are not a substitute for preventing errors, but they are essential for graceful failure and ensuring eventual consistency. Their effectiveness hinges entirely on the operational discipline and tooling around them.

How to handle the scenario that it takes weeks or even months to fix a bug and get it deployed

This question touches on a major reality of enterprise software: deployment cycles can be slow, and manual intervention for DLQs simply isn’t scalable for a continuous stream of errors. This is precisely why a purely “human-in-the-loop” DLQ management strategy breaks down in a corporate context with persistent bugs.

To handle a bug that takes weeks/months to deploy, you need a strategy that focuses on containment, bypass, and eventually, reconciliation.


Strategy for Long-Deployment-Cycle Bugs (Weeks/Months)

The goal is to stop the errors from piling up indefinitely in the DLQ, and to allow the rest of the system to continue processing, even if some data is temporarily inconsistent.

1. Immediate Containment: Filtering or Skipping “Poison Pill” Messages
  • Implement a “Hot Fix” Filter (Code-based or SMT-based):

    • In your Kafka Consumer (PortalEventConsumerStartupHook): If you identify a bug where a specific type of event (or event with specific data) consistently causes failures:
      • Add a temporary code filter. For instance, if ScheduleCreatedEvent with null userId is causing NullPointerException, add:
        if (eventType.equals(PortalConstants.SCHEDULE_CREATED_EVENT) && eventMap.get("userId") == null) {
            logger.warn("Skipping known bug event type {} for record {} due to null userId. Not processing.", eventType, record.offset());
            handlePermanentFailure(record, "Known bug: null userId for " + eventType, "KnownBugSkip");
            return true; // Mark as handled (DLQ'd), commit offset, move on.
        }
        
      • If the bug is in a specific dbProvider method: You can wrap that call in a try-catch for PermanentProcessingException specifically for that event type, and if it’s the known bug, send it to DLQ and commit.
    • Using Kafka Connect SMT (if source is Kafka Connect): You could implement a custom Filter SMT that drops/routes specific problematic messages before they even hit your consumer app. This requires deploying a new SMT, but it can be faster than an app deployment.
  • Why: This immediately stops the DLQ from growing uncontrollably with known bad messages. It sacrifices processing that specific message but ensures the consumer stays healthy.

2. Automated (Limited) Re-driving for Transient/Known Issues (Or Triage)
  • “Error Triage” Consumer: Instead of just sending to a single DLQ, consider a dedicated consumer that subscribes to your main DLQ topic.
    • This consumer acts as an automated triage.
    • It checks the errorType (from handlePermanentFailure’s metadata).
    • If errorType is “TransientSqlError” or “RetriesExhausted” (but could eventually succeed): It re-publishes the original message back to the portal-event topic with an exponential backoff. It might implement its own max retries (e.g., 50 retries over 24 hours). If it still fails, then it pushes to a “Final DLQ” that truly requires manual intervention.
    • If errorType is “ConcurrencyConflict”, “DataValidationError”, “UnhandledEventType”, or “KnownBugSkip”: It pushes to a separate “Permanent DLQ” topic. This queue is smaller and truly requires human eyes.
  • Why: This handles messages that might eventually self-resolve or that you know can’t be fixed by immediate retries but aren’t necessarily “dead forever.” It reduces the volume of messages requiring immediate human attention (a sketch of such a triage consumer follows after this list).
3. Manual Intervention for “Permanent DLQ” / Complex Bugs (When Devs Get Involved)
  • The “Permanent DLQ” is where true bugs/bad data sit.
  • The same monitoring and alerting from before applies, but now it’s for a much smaller, higher-priority queue.
  • Developers must actively:
    • Analyze: What exactly caused this? Why did it bypass automated retries/filters?
    • Fix: Develop and deploy the bug fix.
    • Reconcile/Re-drive:
      • If the bug fix resolves the issue, use a re-driver tool to re-submit messages from the Permanent DLQ to the portal-event topic.
      • If the bug resulted in data inconsistencies that can’t be fixed by re-driving (e.g., a critical business state was violated), you might need to perform a manual database correction on the affected aggregate(s) (this is the most dangerous and should be avoided if possible).
4. Long-Term Data Reconciliation / Auditing
  • Offline Reconciliation: For critical data, implement daily/weekly batch jobs that compare the state of your read model tables with the authoritative Event Store.
    • If discrepancies are found, they are reported, and a reconciliation process is triggered (either manual or automated). This ensures that even if events were missed or misapplied, data consistency is eventually achieved.
  • Event Replay (When all else fails): If a significant bug causes widespread data corruption or loss of consistency, the ultimate fallback is to:
    1. Deploy the bug fix.
    2. Stop the affected read model consumer.
    3. Clear the affected read model tables.
    4. Replay all historical events from the event_store_t (or long-retention Kafka topics) through the fixed consumer logic. This rebuilds the read model from scratch, reflecting the correct business logic. This is why Event Sourcing is so powerful.
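A minimal sketch of the triage consumer from point 2, assuming handlePermanentFailure attaches the error category as a Kafka header named errorType; the header name, topic names, and the omission of backoff handling are simplifications rather than existing project code:

// DlqTriageConsumer.java -- routes DLQ records by error category.
package com.example.eventoutbox.ops;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;

import java.nio.charset.StandardCharsets;
import java.util.Set;

public class DlqTriageConsumer {

    private static final Set<String> RETRIABLE = Set.of("TransientSqlError", "RetriesExhausted");

    private final KafkaProducer<String, String> producer;

    public DlqTriageConsumer(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    // Called for each record polled from the main DLQ topic.
    public void triage(ConsumerRecord<String, String> record) {
        String errorType = headerValue(record, "errorType");
        if (RETRIABLE.contains(errorType)) {
            // Might eventually succeed: send back to the original topic for another pass.
            producer.send(new ProducerRecord<>("portal-event", record.key(), record.value()));
        } else {
            // ConcurrencyConflict, DataValidationError, UnhandledEventType, KnownBugSkip, ...
            // These need human eyes: route to the smaller, high-priority permanent DLQ.
            producer.send(new ProducerRecord<>("portal-event-dlq-permanent", record.key(), record.value()));
        }
    }

    private static String headerValue(ConsumerRecord<String, String> record, String name) {
        Header header = record.headers().lastHeader(name);
        return header == null ? "" : new String(header.value(), StandardCharsets.UTF_8);
    }
}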

Example Workflow with a Long-Deployment-Cycle Bug

  1. Bug Identified: ScheduleCreatedEvent creates a schedule, but due to a bug in the consumer’s dbProvider.createSchedule method, it tries to insert a duplicate primary key if scheduleId (aggregate ID) exists, and this causes a permanent error in the consumer.
  2. Immediate Containment (Filter/Bypass):
    • A hotfix is applied to the PortalEventConsumerStartupHook (or a dedicated filter SMT) to recognize ScheduleCreatedEvent where scheduleId already exists.
    • For such events, it calls handlePermanentFailure() to route the message to a portal-event-dlq-permanent topic (or a KnownBugDLQ). This prevents the main consumer from getting stuck.
  3. DLQ Accumulation & Monitoring: Messages related to this bug pile up in portal-event-dlq-permanent. Alerts are firing.
  4. Development Fix: The development team works on a fix for dbProvider.createSchedule (e.g., changing it to an UPSERT if a “create” event implies “idempotent create/update” or better handling of duplicate primary keys if it truly is an error). This takes weeks.
  5. Deployment: The fix is deployed.
  6. Re-driving: Operations team uses the re-driver tool to:
    • Read messages from portal-event-dlq-permanent.
    • Publish them back to the portal-event topic.
    • The now-fixed consumer processes them correctly.

This approach balances immediate operational stability with eventual consistency, acknowledging the realities of enterprise deployment cycles.

MSAL Light OAuth Integration

Token Exchange

We have a React Single Page Application (SPA) running within a corporate network that uses Azure AD for SSO. However, we want JWT tokens from another OAuth 2.0 provider, Light-OAuth, which issues tokens with additional custom claims for fine-grained authorization. What is the best way to do the integration?

This is a common and powerful architectural pattern used to separate authentication from authorization, allowing you to use a central identity provider (like Azure AD) for who the user is and a specialized service for what the user can do.

The goal is to use the trusted identity established by Microsoft to “mint” a new, enriched token from your second provider.

The best and most secure way to achieve this is by using the OAuth 2.0 Token Exchange Flow (RFC 8693). This process happens on your backend, ensuring that no sensitive client secrets or complex token logic is exposed in the React SPA.

Let’s break down the architecture and the steps involved.


In this model, your React SPA’s responsibility remains simple: authenticate the user with Microsoft and get a token. It then sends this token to your own backend API. Your backend is the one that communicates with the second OAuth provider to exchange the token.

Here is a diagram of the flow:

+-----------+     +-----------------+     +----------------+
|           | (1) |                 | (2) |                |
| React SPA |---->|  Microsoft      |---->|   React SPA    |
|           |     |  Entra ID (MSAL)|     | (has MS Token) |
+-----------+     +-----------------+     +----------------+
      |
      | (3) Sends MS Token to Backend
      v
+-----------+     +-----------------+     +----------------+
|           | (4) |                 | (5) |                |
| Your      |---->| Second OAuth    |---->|  Your          |
| Backend   |     | Provider        |     |  Backend       |
| API       |     | (Token Exchange)|     | (has New Token)|
+-----------+     +-----------------+     +----------------+
      |
      | (6) Establishes session / returns new token
      v
+------------+
| React SPA  |
| (is now    |
| authorized)|
+------------+

Why this is the best approach:

  • Security: Your second provider’s client_id and client_secret are kept securely on your backend, never exposed to the browser.
  • Separation of Concerns: The React app only worries about authenticating with Microsoft. Your backend handles the complex authorization logic.
  • Flexibility: You can swap out the second authorization provider without changing your entire frontend authentication flow.
  • Standard-Compliant: The Token Exchange grant type is an official standard designed for this exact “delegation” or “impersonation” scenario.

Step-by-Step Integration Guide

Here’s how to implement this pattern.

Step 1: Configure Your Second OAuth 2.0 Provider

This is the most critical setup step. You need to configure your authorization provider (e.g., Auth0, Okta, Duende IdentityServer, or a custom one) to:

  1. Trust Microsoft Entra ID (Azure AD) as an Identity Provider: It must be able to validate the signature of the token it receives from your backend. This usually involves pointing it to Microsoft’s OIDC discovery endpoint (https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration) to get the public signing keys.
  2. Enable the Token Exchange Grant Type: You’ll need to create a new “client” or “application” within this provider for your backend API. This client must be configured to use the urn:ietf:params:oauth:grant-type:token-exchange grant type.
  3. Define a User Mapping Strategy: The provider needs to know how to link the incoming Microsoft token to a user in its own database. A common practice is to map the oid (Object ID) or sub (Subject) claim from the Microsoft token to a user profile in the second provider. This is how it knows which fine-grained permissions (claims) to add.
  4. Define the Custom Claims: Configure the rules that add the additional claims to the new token when the exchange is successful. For example: “If the incoming user has oid ‘123-abc’, add the claims permissions: ['create:document', 'read:report'].”
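
As a purely illustrative sketch (not light-oauth’s or any provider’s actual configuration API), the mapping and claim rules in steps 3 and 4 amount to logic like the following, keyed on the oid claim of the validated Microsoft token. The class name and permission values are hypothetical:

import java.util.List;
import java.util.Map;

// Hypothetical claim-mapping rules: Microsoft 'oid' -> fine-grained permissions.
// In a real provider this would live in its user store / policy configuration.
public class ClaimMappingSketch {
    private static final Map<String, List<String>> PERMISSIONS_BY_OID = Map.of(
            "123-abc", List.of("create:document", "read:report")
    );

    // Given the validated Microsoft token claims, return the custom claims
    // to embed in the newly minted token (empty list if the user is unknown).
    public static List<String> permissionsFor(Map<String, Object> microsoftClaims) {
        String oid = (String) microsoftClaims.get("oid");
        return PERMISSIONS_BY_OID.getOrDefault(oid, List.of());
    }
}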

Step 2: Update Your React SPA Logic

Your React app’s interaction with MSAL will remain largely the same, with one key difference in what you do after a successful login.

  1. Authenticate and Acquire a Token: Use MSAL as you normally would to log the user in and get an access token for your own backend API.

    // msalConfig.js - Make sure you have a scope for your own backend API
    export const msalConfig = {
      auth: { /* ... */ },
      cache: { /* ... */ },
    };
    
    export const loginRequest = {
      scopes: ["User.Read", "api://<your-backend-client-id>/access_as_user"]
    };
    
  2. Call Your Backend: After getting the token, instead of using it to call various protected resources, you make a single call to a dedicated endpoint on your backend (e.g., /auth/ms/exchange) to initiate the session.

    import { useMsal } from "@azure/msal-react";
    import { loginRequest } from "./msalConfig";
    
    function MyComponent() {
      const { instance, accounts } = useMsal();
    
      const handleLoginAndExchange = async () => {
        try {
          // 1. Get the MSAL token for our backend
          const response = await instance.acquireTokenSilent({
            ...loginRequest,
            account: accounts[0],
          });
          const microsoftAccessToken = response.accessToken;
    
          // 2. Send it to our backend for exchange
          const backendResponse = await fetch('/auth/ms/exchange', {
            method: 'POST',
            headers: {
              'Authorization': `Bearer ${microsoftAccessToken}`,
              'Content-Type': 'application/json',
            },
          });
    
          if (!backendResponse.ok) {
            throw new Error('Token exchange failed');
          }
    
          // The backend will likely set a secure HttpOnly cookie,
          // so there might be nothing else to do here.
          // Or, it might return the new token to be stored in memory.
          const { newAccessToken } = await backendResponse.json();
          console.log("Received new, enriched token from our backend!");
          // Now use this newAccessToken for subsequent API calls
    
        } catch (error) {
          // Handle token acquisition or exchange errors
          console.error(error);
          if (error.name === "InteractionRequiredAuthError") {
             instance.acquireTokenPopup(loginRequest);
          }
        }
      };
      // ...
    }
    

Step 3: Implement the Backend Token Exchange Endpoint

This is where the core logic resides. You’ll create an endpoint that receives the Microsoft token and exchanges it.

  1. Protect the Endpoint: Configure your backend to validate the Bearer token from Microsoft that it receives from your React app. This ensures only authenticated users from your SPA can trigger an exchange.

  2. Implement the Exchange Logic:

        if (exchange.getRelativePath().equals(config.getExchangePath())) {
            // token exchange request handling.
            if(logger.isTraceEnabled()) logger.trace("MsalTokenExchangeHandler exchange is called.");
    
            String authHeader = exchange.getRequestHeaders().getFirst(Headers.AUTHORIZATION);
            if (authHeader == null || !authHeader.startsWith("Bearer ")) {
                setExchangeStatus(exchange, JWT_BEARER_TOKEN_MISSING);
                return;
            }
            String microsoftToken = authHeader.substring(7);
    
            // --- Validate the incoming Microsoft Token ---
            if(msalJwtVerifier == null) {
                // handle case where config failed to load
                throw new Exception("MsalJwtVerifier is not initialized.");
            }
            try {
                // We only need to verify it, we don't need the claims for much.
                // The second provider will do its own validation and claim mapping.
                // Set skipAudienceVerification to true if the 'aud' doesn't match this BFF's client ID.
                String reqPath = exchange.getRequestPath();
                msalJwtVerifier.verifyJwt(microsoftToken, msalSecurityConfig.isIgnoreJwtExpiry(), true, null, reqPath, null);
            } catch (InvalidJwtException e) {
                logger.error("Microsoft token validation failed.", e);
                setExchangeStatus(exchange, INVALID_AUTH_TOKEN, e.getMessage());
                return;
            }
    
            // --- Perform Token Exchange ---
            String csrf = UuidUtil.uuidToBase64(UuidUtil.getUUID());
            TokenExchangeRequest request = new TokenExchangeRequest();
            request.setSubjectToken(microsoftToken);
            request.setSubjectTokenType("urn:ietf:params:oauth:token-type:jwt");
            request.setCsrf(csrf); // The CSRF for the *new* token we are getting
    
            Result<TokenResponse> result = OauthHelper.getTokenResult(request);
            if (result.isFailure()) {
                logger.error("Token exchange failed with status: {}", result.getError());
                setExchangeStatus(exchange, TOKEN_EXCHANGE_FAILED, result.getError().getDescription());
                return;
            }
    
            // --- The setCookies logic is identical ---
            List<String> scopes = setCookies(exchange, result.getResult(), csrf);
            if(logger.isTraceEnabled()) logger.trace("scopes = {}", scopes);
    
            exchange.setStatusCode(StatusCodes.OK);
            exchange.getResponseHeaders().put(Headers.CONTENT_TYPE, "application/json");
            // Return the scopes in the response body
            Map<String, Object> rs = new HashMap<>();
            rs.put(SCOPES, scopes);
            exchange.getResponseSender().send(JsonMapper.toJson(rs));
        } else if (exchange.getRelativePath().equals(config.getLogoutPath())) {
            // logout request handling, this is the same as StatelessAuthHandler to remove the cookies.
            if(logger.isTraceEnabled()) logger.trace("MsalTokenExchangeHandler logout is called.");
            removeCookies(exchange);
            exchange.endExchange();
        } else {
            // This is the subsequent request handling after the token exchange. Here we verify the JWT in the cookies.
            if(logger.isTraceEnabled()) logger.trace("MsalTokenExchangeHandler is called for subsequent request.");
            String jwt = null;
            Cookie cookie = exchange.getRequestCookie(ACCESS_TOKEN);
            if(cookie != null) {
                jwt = cookie.getValue();
                // verify the jwt with the internal verifier, the token is from the light-oauth token exchange.
                JwtClaims claims = internalJwtVerifier.verifyJwt(jwt, securityConfig.isIgnoreJwtExpiry(), true);
                String jwtCsrf = claims.getStringClaimValue(Constants.CSRF);
                // get csrf token from the header. Return an error if it doesn't exist.
                String headerCsrf = exchange.getRequestHeaders().getFirst(HttpStringConstants.CSRF_TOKEN);
                if(headerCsrf == null || headerCsrf.trim().length() == 0) {
                    setExchangeStatus(exchange, CSRF_HEADER_MISSING);
                    return;
                }
                // verify csrf from jwt token in httpOnly cookie
                if(jwtCsrf == null || jwtCsrf.trim().length() == 0) {
                    setExchangeStatus(exchange, CSRF_TOKEN_MISSING_IN_JWT);
                    return;
                }
                if(logger.isDebugEnabled()) logger.debug("headerCsrf = " + headerCsrf + " jwtCsrf = " + jwtCsrf);
                if(!headerCsrf.equals(jwtCsrf)) {
                    setExchangeStatus(exchange, HEADER_CSRF_JWT_CSRF_NOT_MATCH, headerCsrf, jwtCsrf);
                    return;
                }
                // Renew the token 1.5 minutes before it expires to keep the session alive while the user is still
                // active, regardless of whether the refreshToken is a long-term "remember me" token or not. Periodic
                // API access (e.g., the private message API polled every minute) keeps the session going until the
                // browser tab is closed.
                if(claims.getExpirationTime().getValueInMillis() - System.currentTimeMillis() < 90000) {
                    jwt = renewToken(exchange, exchange.getRequestCookie(REFRESH_TOKEN));
                }
            } else {
                // renew the token and set the cookies
                jwt = renewToken(exchange, exchange.getRequestCookie(REFRESH_TOKEN));
            }
            if(logger.isTraceEnabled()) logger.trace("jwt = " + jwt);
            if(jwt != null) exchange.getRequestHeaders().put(Headers.AUTHORIZATION, "Bearer " + jwt);
            // If there is no jwt and no refresh token available in the cookies, the user is not logged in or
            // the session has expired. Or the endpoint being accessed doesn't need a token; for example, in the
            // light-portal command side, createUser doesn't need a token. Let the request go through to the
            // service, which will return an error if it does require a token.
            // Don't call the next handler if the exchange was already completed in renewToken due to an error.
            if(!exchange.isComplete()) Handler.next(exchange, next);
        }
    

What to Avoid: The Anti-Pattern

Do not try to perform two separate, chained OAuth flows in the frontend. This would involve:

  1. User logs in with MSAL.
  2. Your React app gets the MSAL token.
  3. Your React app then initiates a second redirect or popup flow with the other provider, trying to pass the MSAL token as a parameter.

This is a bad idea because:

  • Terrible User Experience: It can lead to multiple redirects, popups, and a confusing login process.
  • Security Risk: It increases the surface area for token handling in the browser and might require you to use less secure flows (like Implicit flow) on the second provider.
  • Complexity: Managing the state of two independent authentication libraries and their tokens in a SPA is extremely difficult and error-prone.

Client Secret

The token exchange specification doesn’t require a client_id and client_secret to be sent to the second OAuth 2.0 provider to exchange the token. However, it is highly recommended to pass the client_id and client_secret from the BFF to the second OAuth 2.0 provider; the subject token alone is not sufficient.

This is a critical security aspect of the Token Exchange flow. Let’s break down why.

The “Two Questions” Security Model

When your BFF makes the token exchange request, the second OAuth provider needs to answer two fundamental security questions:

  1. WHO IS THE USER? (Authentication of the Subject)

    • This question is answered by the subject_token (the Microsoft token).
    • The provider validates the token’s signature, issuer (iss), expiration (exp), and audience (aud) to confirm that it’s a legitimate token for a valid user from a trusted identity provider (Microsoft).
  2. WHO IS ASKING FOR THIS TOKEN? (Authentication of the Client)

    • This question is answered by the client_id and client_secret.
    • This is crucial. The provider needs to know which application is requesting to act on the user’s behalf. It’s not enough that the user is valid; the application making the request must also be a known, trusted, and authorized client.

Why the Subject Token Alone is a Security Risk

Imagine if only the subject_token were required. Any malicious actor or compromised service that managed to get a user’s Microsoft access token could then send it to your second OAuth provider and exchange it for a new token containing your fine-grained authorization claims. This would allow them to impersonate the user within your system completely.

By requiring the client_id and client_secret, you ensure that only your specific, trusted BFF application is allowed to perform this exchange. The client_secret is the proof that the request is coming from your backend and not some other application.


The Token Exchange Request Body

So, the POST request your MsalTokenExchangeHandler (the BFF) sends to your second provider’s token endpoint will be application/x-www-form-urlencoded and must look like this:

grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&client_id=YOUR_BFFS_CLIENT_ID_FOR_THE_SECOND_PROVIDER
&client_secret=YOUR_BFFS_CLIENT_SECRET
&subject_token=THE_MICROSOFT_ACCESS_TOKEN_FROM_THE_SPA
&subject_token_type=urn:ietf:params:oauth:token-type:access_token
&scope=permissions_for_the_new_token
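
For illustration, the same form body could be assembled and posted with the JDK’s java.net.http.HttpClient as sketched below. The token endpoint URL and credential values are placeholders, and the light-4j handler above delegates this step to OauthHelper instead:

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

public class TokenExchangeClientSketch {
    public static HttpResponse<String> exchange(String microsoftAccessToken) throws Exception {
        // Form parameters mirroring the request body shown above.
        Map<String, String> form = Map.of(
                "grant_type", "urn:ietf:params:oauth:grant-type:token-exchange",
                "client_id", "YOUR_BFFS_CLIENT_ID_FOR_THE_SECOND_PROVIDER",
                "client_secret", "YOUR_BFFS_CLIENT_SECRET",
                "subject_token", microsoftAccessToken,
                "subject_token_type", "urn:ietf:params:oauth:token-type:access_token",
                "scope", "permissions_for_the_new_token");

        String body = form.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8) + "="
                        + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://second-provider.example.com/oauth2/token")) // placeholder endpoint
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        return HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}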

Configuration Checklist for your Second OAuth Provider

This means that on your second OAuth 2.0 provider, you must:

  1. Create a Client Registration: Create a new “Application” or “Client” specifically for your light-gateway BFF.
  2. Set Client Type: Configure this client as a Confidential Client (as opposed to a Public Client like a SPA), because it is capable of securely storing a secret.
  3. Generate Credentials: Generate a client_id and a client_secret for this BFF client.
  4. Enable Grant Type: Explicitly enable the urn:ietf:params:oauth:grant-type:token-exchange grant type for this specific client. Your provider’s security policy should only allow trusted, confidential clients to use this powerful grant type.
  5. Store Credentials Securely: Store the generated client_id and client_secret securely in your BFF’s configuration (e.g., in secret.yml or environment variables), where they are not exposed to the outside world.

In short:

  • The subject_token proves who the user is.
  • The client_id and client_secret prove who your BFF is.

Both are required for a secure delegation and token exchange process.

Token Verification

It is necessary to verify the JWT from Azure AD on both the BFF and light-oauth, based on the “Zero Trust” principle. They perform the validation for different, but equally important, reasons.

Skipping the validation on the BFF, while technically possible, is a significant security anti-pattern. Let’s break down the distinct roles of each validation step.


1. The BFF’s Responsibility: “Am I Talking to a Legitimate Client?”

The validation performed by your MsalTokenExchangeHandler in the BFF serves as a gatekeeper for your own system. Its purpose is to protect the BFF itself and the downstream services it communicates with.

When the BFF validates the Microsoft token, it’s asking these questions:

  • Is this token even real? (Signature validation).
  • Is it from an identity provider I trust? (Checking that the iss or “issuer” claim is from login.microsoftonline.com/...).
  • Is this token actually meant for me? (This is CRITICAL). The BFF must check the aud or “audience” claim. The aud should be the Client ID of your BFF application. This prevents a token that was issued for another API (like the Microsoft Graph API) from being replayed against your BFF to trick it. This is a defense against the “confused deputy” problem.
  • Has it expired? (Checking the exp or “expiration” claim).
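
The handler shown earlier delegates these checks to light-4j’s msalJwtVerifier. For reference, here is a rough sketch of the same checks expressed directly with jose4j (the library that the handler’s JwtClaims and InvalidJwtException types come from); the tenant ID, JWKS URL, and BFF client ID are placeholders:

import org.jose4j.jwk.HttpsJwks;
import org.jose4j.jwt.JwtClaims;
import org.jose4j.jwt.consumer.InvalidJwtException;
import org.jose4j.jwt.consumer.JwtConsumer;
import org.jose4j.jwt.consumer.JwtConsumerBuilder;
import org.jose4j.keys.resolvers.HttpsJwksVerificationKeyResolver;

public class MsalTokenCheckSketch {
    public static JwtClaims validate(String microsoftToken) throws InvalidJwtException {
        // jwks_uri published in the tenant's OIDC discovery document (placeholder tenant).
        HttpsJwks jwks = new HttpsJwks("https://login.microsoftonline.com/{tenant-id}/discovery/v2.0/keys");

        JwtConsumer consumer = new JwtConsumerBuilder()
                .setVerificationKeyResolver(new HttpsJwksVerificationKeyResolver(jwks)) // signature
                .setExpectedIssuer("https://login.microsoftonline.com/{tenant-id}/v2.0") // iss
                .setExpectedAudience("<bff-client-id>")                                  // aud
                .setRequireExpirationTime()                                              // exp
                .build();

        return consumer.processToClaims(microsoftToken);
    }
}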

Why this is crucial for the BFF:

  • Fail Fast: You immediately reject invalid, expired, or improperly targeted tokens. This is a better user experience and saves system resources.
  • Denial-of-Service (DoS) Protection: If you don’t validate, your BFF becomes a dumb proxy that forwards every piece of junk it receives to your second OAuth provider. An attacker could flood your BFF with garbage tokens, causing it to swamp your authorization server with useless validation and exchange requests, potentially taking it down.
  • Security Boundary: The BFF is the first line of defense. It should never blindly trust any input it receives from the public internet, even from your own SPA.

2. The Second OAuth Provider’s Responsibility: “Can I Issue a New Token for this Subject?”

The validation performed by the second OAuth provider is the authoritative act of delegation. It’s the ultimate source of truth for the new, enriched token. It cannot and must not trust that the BFF has already performed a valid check.

When the second OAuth provider receives the subject_token, it asks all the same questions as the BFF, but for its own security policy:

  • Is this token real and from a trusted issuer? (Signature and iss validation). It must have Microsoft configured as a trusted external identity provider.
  • Is this token meant for a client that is allowed to exchange it? (It might check the aud claim).
  • Is the client making the request (the BFF) authorized to perform a token exchange? (This is validated via the client_id and client_secret you send in the request).
  • How do I map this external user to an internal user? (This is the most important unique step). It will inspect the oid, sub, email, or another claim from the Microsoft token to find the corresponding user in its own database.
  • What new claims should I issue for this user? Based on the mapped internal user, it will apply its authorization rules to mint the new token with fine-grained permissions.

Analogy: A High-Security Building

Think of it like this:

  1. The React SPA is you, the visitor.
  2. The Microsoft Token is your government-issued driver’s license.
  3. The BFF is the receptionist at the front desk of the building. They look at your driver’s license (BFF validation) to make sure it’s not expired and that your name is on the visitor list for that day (aud check). They protect the building from random people just walking in.
  4. The Second OAuth Provider is the guard in front of the secure vault on the 10th floor. When you get to the 10th floor, the guard doesn’t just say, “Oh, the receptionist let you in, so you must be fine.” No, they perform their own, more thorough check of your driver’s license (OAuth provider validation), check their specific access logs (user mapping), and then issue you a special keycard (new enriched token) that only opens the specific safety deposit box you’re allowed to access (fine-grained claims).

You wouldn’t want a security system where the vault guard blindly trusts the front desk. Each layer must perform its own validation.

Conclusion

Do not skip the validation on the BFF.

  • Verify on the BFF to protect your own application, fail fast, and prevent it from becoming a DoS vector.
  • The Second OAuth Provider MUST verify as its core function to securely map the user and issue an authoritative, enriched token.

The verifications are not redundant; they are a fundamental part of a layered, defense-in-depth security strategy.

Single Page Application

There are two endpoints that the SPA should access: one for the token exchange and one for logout.

Login

After SSO with Azure AD, you need to send the ID token to the backend API endpoint “/auth/ms/exchange” to establish the session with a GET request. The header is the standard authorization header with “Bearer IdToken”. You will receive a JSON response with a list of scopes that represent the access permissions. You can display them to the user for consent or simply ignore them. Along with the response body, some cookies will be set in the browser to establish the session. Once the login is done, the backend will automatically renew the access token with the refresh token as long as the user keeps sending requests to the server.

Logout

To log out, you need to log out from Azure AD and then send a GET request to the backend API endpoint “/auth/ms/logout” to remove the session cookies.

User Registration and Onboarding

In light-portal, user management is the foundation of the authentication and authorization to access different services.

There are two different approaches to create a user entry in the system: Internet and Corporation.

Internet User

This is for Internet users to register and verify via email to a cloud Light Portal instance.

The entry point is the createUser command handler in the user-command service.

Corporation User

This is for corporation users to onboard to a dedicated Intranet Light Portal instance.

The entry point is the onboardUser command handler in the user-command service.

User Password

In the user_t table, the password is nullable, and onboardUser doesn’t take a password because authentication is done through Azure AD, ECIF, etc.

Optimistic vs Pessimistic UI

When you create, update, or delete an entity on the UI and refresh the list immediately, chances are the list doesn’t yet reflect the change. This is a classic challenge when working with systems that use Event Sourcing and CQRS (Command Query Responsibility Segregation).

  • Command: Your deleteHost request is a Command. It’s sent to the write-model to change the state of the system and publish an event (e.g., HostDeletedEvent).
  • Query: Your fetchData request is a Query. It reads from a separate read-model (the hosts database view/table).
  • Eventual Consistency: There is a delay (usually milliseconds, but it can vary) between the command succeeding and the event consumer updating the read-model.

Your UI is so fast that it’s sending the Query before the read-model has been updated, leading to the stale data problem.

Should we wait a few seconds?

No, please do not use a setTimeout to wait. This is the most important takeaway. It’s an unreliable “magic number” that will cause problems:

  • Bad UX: It forces the user to wait for an arbitrary amount of time, even if the system is fast.
  • Unreliable: If the system is under heavy load, the delay might be longer than your timeout, and the bug will reappear.
  • It’s a “code smell”: It indicates that the UI isn’t correctly handling the nature of the backend architecture.

The Professional Solutions

There are two primary, robust patterns for handling this on the UI. The best choice depends on the desired user experience.


Option 1: Optimistic UI (Recommended, Best UX)

This is the most common and user-friendly approach in modern web applications. You assume the command will succeed and update the UI immediately.

How it works:

  1. User clicks “Delete”.
  2. You immediately remove the item from your local React state. The user sees the item disappear instantly.
  3. You send the deleteHost command to the server in the background.
  4. Crucially: If the command fails for some reason (e.g., validation error, server down), you revert the UI change (add the item back) and show an error message.

This provides the best possible user experience because the UI feels instantaneous.

Here is how you would implement this in your handleDelete function:

  // Delete handler - OPTIMISTIC UI APPROACH
  const handleDelete = useCallback(async (row: MRT_Row<HostType>) => {
    if (!window.confirm(`Are you sure you want to delete host: ${row.original.subDomain}?`)) {
      return;
    }

    // Keep a copy of the current data in case we need to roll back
    const originalData = [...data];

    // 1. Optimistically update the UI
    setData(prevData => prevData.filter(host => host.hostId !== row.original.hostId));
    setRowCount(prev => prev - 1); // Also optimistically update the total count

    // 2. Send the command to the server
    const cmd = {
      host: 'lightapi.net',
      service: 'host',
      action: 'deleteHost',
      version: '0.1.0',
      data: { hostId: row.original.hostId, aggregateVersion: row.original.aggregateVersion },
    };

    try {
      const result = await apiPost({ url: '/portal/command', headers: {}, body: cmd });
      if (result.error) {
        // 3a. On failure, revert the UI and show an error
        console.error('API Error on delete:', result.error);
        alert('Failed to delete host. Please try again.'); // Or use a snackbar
        setData(originalData);
        setRowCount(originalData.length); // Revert the count
      }
      // 3b. On success, do nothing! The UI is already correct.
      // You could trigger a silent background refetch here if you want to be 100% in sync, but it's often not necessary.

    } catch (e) {
      // Also handle network errors
      console.error('Network Error on delete:', e);
      alert('Failed to delete host due to a network error.');
      setData(originalData);
      setRowCount(originalData.length);
    }
  }, [data]); // The main dependency is the 'data' for rollback.

Option 2: Pessimistic UI with State Locking (Simpler, Good UX)

This approach is more straightforward. You “lock” the UI in a loading state until you are certain the operation is complete.

How it works:

  1. User clicks “Delete”.
  2. You show a loading spinner on that specific row or disable the whole table.
  3. Send the deleteHost command.
  4. When the command API call returns a success, you then call fetchData() to get the fresh data. Because the command has completed, it’s much more likely the read model is now consistent. This is essentially what you were trying to do before.

The problem, as you noted, is that even after the command returns, the read model might still not be updated. The optimistic approach neatly sidesteps this entire timing issue. Even if a pessimistic approach feels safer, the Optimistic UI is still the superior and often easier pattern to implement correctly.

To fix your current pessimistic implementation, you’d have to implement polling, which is complex:

  1. Send delete command.
  2. On success, start a setInterval to call fetchData every 2 seconds.
  3. In each fetchData response, check if the deleted item is gone.
  4. If it is, clearInterval and stop.
  5. Add a timeout to stop polling after ~10-15 seconds to prevent infinite loops.

As you can see, this is much more complicated than the optimistic update.

Recommendation

Adopt the Optimistic UI pattern (Option 1). It provides the best user experience, is resilient to timing issues caused by eventual consistency, and the implementation is clean and modern. The code provided for the optimistic handleDelete is a drop-in replacement that will solve your problem robustly.

Should you poll periodically to sync other users’ changes?

The short answer is: No, you should still avoid client-side polling for this specific use case. It’s generally the wrong tool for this problem and creates more issues than it solves. The Optimistic UI approach is still preferable, but it needs to be combined with a robust backend and potentially other real-time technologies for a complete solution.

Let’s break down why and explore the professional-grade solutions.


Why Polling is a Bad Fit Here

Your concern is valid: polling does add significant pressure, and it’s inefficient.

  1. High Network Traffic: Every active user would be sending a getHost query every few seconds. If you have 50 users on that page, that’s 10-25 queries per second just from this one component, most of which will return no new data.
  2. Database and Service Layer Load: This traffic directly translates to load on your service and database. Your SELECT query, while indexed, still consumes resources. At scale, this can become a significant performance bottleneck.
  3. Delayed UX: The user experience is still poor. A user makes a change and might have to wait up to X seconds (your polling interval) to see it reflected, which feels sluggish.
  4. Complexity: As we discussed, managing polling logic (starting, stopping, timeouts) on the client adds complexity and potential bugs.

So, while polling can eventually get you the latest data, it’s a brute-force approach with major drawbacks.


The Professional-Grade Solutions for Multi-User Environments

The key is to shift from a “pull” model (client polling) to a “push” model (server notifies the client). This is where real-time technologies shine.

Solution 1: Optimistic UI + Server-Sent Events (SSE) or WebSockets (Best for Real-Time)

This is the gold standard for collaborative applications.

How it Works:

  1. Frontend (Your Optimistic UI):

    • User A deletes a host. Their UI updates instantly (optimistic update). The deleteHost command is sent to the server.
    • User B is looking at the same list. Their screen is unchanged for now.
  2. Backend (The Magic):

    • The command handler processes the deleteHost command and publishes a HostDeletedEvent.
    • An Event Notifier Service listens for this event.
    • Upon receiving the event, this service pushes a notification to all connected clients who are interested in host updates. This is done via Server-Sent Events (SSE) or WebSockets. SSE is often simpler for server-to-client-only communication. A minimal notifier sketch follows this list.
  3. Frontend (Receiving the Push):

    • User B’s browser receives the HostDeletedEvent push notification.
    • The React component’s event listener fires. It can do one of two things:
      • A) Smart Update (Ideal): The event payload contains the hostId that was deleted. The client simply finds that ID in its local data state and removes it. This is hyper-efficient.
      • B) Refetch (Simpler): Upon receiving any host-related event, the client triggers a fetchData() call to get the latest list. This is less efficient than a smart update but still vastly better than polling.
    • User A’s browser also receives the event. It can simply ignore it, as its UI is already up-to-date.
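
Here is a framework-neutral sketch of the notifier in step 2. The class name and the plain Consumer callback are illustrative only; in light-gateway this registry would hold real SSE or WebSocket connections:

import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.function.Consumer;

// Minimal event notifier sketch: each connected browser registers a callback
// (in practice an SSE/WebSocket connection) and every relevant domain event is pushed to all of them.
public class HostEventNotifier {
    private final Set<Consumer<String>> clients = new CopyOnWriteArraySet<>();

    public void register(Consumer<String> client)   { clients.add(client); }
    public void unregister(Consumer<String> client) { clients.remove(client); }

    // Called by the event consumer when e.g. a HostDeletedEvent arrives.
    public void broadcast(String eventJson) {
        clients.forEach(client -> client.accept(eventJson));
    }
}

On the browser side, an EventSource (or WebSocket) listener would receive the payload and either remove the host by ID (smart update) or trigger a refetch.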

Why this is the best solution:

  • Real-Time: Updates are pushed instantly to all users.
  • Hyper-Efficient: No unnecessary network requests. The server and client only communicate when there’s an actual state change.
  • Scalable: A single event from the backend can update thousands of connected clients simultaneously.
  • Excellent UX: The application feels alive and collaborative.

Solution 2: Optimistic UI + Stale-While-Revalidate (SWR) / react-query with Refetch-on-Focus

This is a powerful and very easy-to-implement pattern that offers a great “80% solution” without needing a full real-time backend setup. Libraries like react-query (now TanStack Query) or Vercel’s swr are built for this.

How it Works:

  1. You replace your manual useState/useEffect/fetchData logic with the useQuery hook from react-query.
  2. User A deletes a host. You perform an optimistic update using the library’s built-in tools.
  3. User B is looking at the list. Nothing happens yet.
  4. Now, User B switches from another browser tab back to your application tab.
  5. react-query automatically detects this “window focus” event and triggers a background refetch of the data.
  6. The UI is seamlessly updated with the latest data (showing User A’s deletion).

Why this is a great solution:

  • Extremely Simple to Implement: You get this behavior for free just by using the library.
  • “Good Enough” Real-Time: Data is refreshed exactly when the user is most likely to need it (when they re-engage with the app).
  • Efficient: Avoids constant polling. It only refetches on specific, user-driven events (window focus, network reconnect, etc.).
  • Handles Caching, Loading States, etc.: These libraries solve many data-fetching headaches for you.

Recommendation & Path Forward

  1. Immediate Step: Stick with the Optimistic UI approach from my previous answer. It correctly handles the single-user eventual consistency problem, which is your most pressing issue. It’s the foundation for everything else.

  2. Next Step (Highly Recommended): Introduce a data-fetching library like TanStack Query (react-query). This will simplify your code and give you the “refetch-on-focus” behavior out of the box, largely solving the multi-user problem with minimal effort.

  3. Long-Term Goal (For True Real-Time): If your application’s core value is real-time collaboration (like a Google Doc or Figma), then plan to add a Server-Sent Events (SSE) or WebSocket layer to your backend to push updates to clients.

In summary: Avoid client-side polling. Implement the optimistic UI pattern now, and for multi-user synchronization, use a purpose-built library like react-query or a real-time backend push technology like SSE.

Soft Delete vs Hard Delete

Here is a classic problem in Event Sourcing, often related to the concept of “soft deletes” or “state transitions” versus “hard deletes” and re-insertions. The core issue is that aggregate_version must be strictly unique for a given aggregate. If you try to re-insert an aggregate at an old version, it fundamentally violates Event Sourcing principles.

Let’s break down the scenario and the best ways to handle it.


The Problem Scenario: Version Conflict on Re-add

Your scenario:

  1. UserHostCreatedEvent (userId=U, hostId=H, aggregate_version=1) -> event_store_t has version 1. user_host_t (projection) has version 1.
  2. UserHostDeletedEvent (userId=U, hostId=H, aggregate_version=2) -> event_store_t has version 2. user_host_t either deletes or marks as inactive.
  3. UserHostCreatedEvent (userId=U, hostId=H, aggregate_version=1) -> CONFLICT! This event says the aggregate (U,H) is at version 1 again, but event_store_t already has version 2 for (U,H).

Root Cause: You cannot “re-add” an aggregate at an old version. An aggregate’s version always strictly increases. The action of “adding back” is not a “first time add” in the event history; it’s a new state transition.


Best Ways to Handle This Kind of Scenario

The solution involves redefining what “add back” means in an Event Sourcing context and how your aggregates and projections handle it.

Option 1: State Transitions for a (User, Host) Aggregate (Recommended)

This is the most common and robust approach. Instead of thinking of “add” and “remove” as discrete CRUD operations on a single record, think of them as state changes of an aggregate instance that always exists.

Aggregate Design (Conceptual UserHostMapping Aggregate):

  • An aggregate representing the state of a (User, Host) relationship (e.g., UserHostMappingAggregate(userId, hostId)).
  • It has a state, e.g., ACTIVE, INACTIVE.
  • The aggregate_id for this aggregate would be a composite ID (e.g., userId + "-" + hostId or a UUID that represents this specific mapping).
  • It has a version (sequence number).

Event Types:

  • UserHostActivatedEvent (userId, hostId, sequence_number)
  • UserHostDeactivatedEvent (userId, hostId, sequence_number)

Scenario with State Transitions:

  1. Add Host to User Mapping (First Time):

    • Command: ActivateUserHostMapping(userId=U, hostId=H, expectedVersion=0) (Expected version 0 because it doesn’t exist yet).
    • Aggregate (U,H): Generates UserHostActivatedEvent (userId=U, hostId=H, sequence_number=1).
    • event_store_t: Saves version 1.
    • user_host_t (projection): INSERTS record (U, H, status=ACTIVE, aggregate_version=1).
  2. Remove Host to User Mapping:

    • Command: DeactivateUserHostMapping(userId=U, hostId=H, expectedVersion=1).
    • Aggregate (U,H): Generates UserHostDeactivatedEvent (userId=U, hostId=H, sequence_number=2).
    • event_store_t: Saves version 2.
    • user_host_t (projection): UPDATES record (U, H) to status=INACTIVE, aggregate_version=2. (Doesn’t delete the row).
  3. Add Back the Same Host to User Mapping:

    • Command: ReactivateUserHostMapping(userId=U, hostId=H, expectedVersion=2). (Expected version 2 because it’s currently INACTIVE at version 2).
    • Aggregate (U,H): Generates UserHostActivatedEvent (userId=U, hostId=H, sequence_number=3).
    • event_store_t: Saves version 3.
    • user_host_t (projection): UPDATES record (U, H) to status=ACTIVE, aggregate_version=3.

Benefits of State Transitions:

  • Strictly Monotonic Versions: The sequence_number for the UserHostMapping aggregate (U,H) always increases (0 -> 1 -> 2 -> 3). No version conflicts.
  • Complete History: The Event Store clearly shows the activation/deactivation cycle.
  • Simpler Projection: The projection (user_host_t) never deletes rows; it only updates their status and version. This makes updates simple (UPDATE ... WHERE aggregate_id = ? AND aggregate_version = ?) and avoids INSERT conflicts on “re-add.”
  • Idempotent Read Model Updates: The consumer logic is straightforward.

Option 2: Unique ID for Each Relationship Instance (Less common for simple toggles)

  • Approach: Instead of (U,H) being one aggregate that changes status, you treat each “active period” of (U,H) as a new, distinct aggregate.
  • aggregate_id: A brand new UUID for each activation of (U,H).
  • Event Types:
    • UserHostCreatedEvent (mappingId=M1, userId=U, hostId=H, sequence_number=1)
    • UserHostDeletedEvent (mappingId=M1, userId=U, hostId=H, sequence_number=2)
    • UserHostCreatedEvent (mappingId=M2, userId=U, hostId=H, sequence_number=1) (for the second time)
  • Projection: The user_host_t table would track these mappingIds, possibly with start_ts and end_ts. When a mapping is terminated, you update its end_ts. When “added back,” you insert a new row with a new mappingId.
  • Complexity: Managing which mappingId is current for (U,H) can be tricky. It’s usually overkill for simple active/inactive toggles.

Option 3: History Table for User Host Mapping

  • Approach: Create a user_host_history_t to keep a history of UserHostMapping.
  • Projection: Queries join user_host_t and user_host_history_t to provide both the current snapshot and the historical view.
  • Complexity: Managing both the current and history tables is overkill for this use case unless you need historical queries very frequently.

Go with Option 1: State Transitions for a (User, Host) Aggregate.

Detailed Changes:

  1. Database Schema for user_host_t:

    • Add a status column (e.g., VARCHAR(10) NOT NULL DEFAULT 'ACTIVE').
    • Ensure aggregate_version column exists.
    • Primary key/unique constraint likely remains (host_id, user_id).
    ALTER TABLE user_host_t
    ADD COLUMN status VARCHAR(10) NOT NULL DEFAULT 'ACTIVE',
    ADD COLUMN aggregate_version BIGINT NOT NULL DEFAULT 0;
    
    -- Add a unique constraint if not already present on (host_id, user_id)
    -- ALTER TABLE user_host_t ADD CONSTRAINT pk_user_host PRIMARY KEY (host_id, user_id);
    
  2. Define specific Event Types:

    • UserHostActivatedEvent
    • UserHostDeactivatedEvent
  3. Command Handling Logic (Write Model):

    • When the “add host to user” command comes in:
      • Load the UserHostMapping aggregate (identified by (host_id, user_id)).
      • If not found (expectedVersion 0), generate UserHostActivatedEvent.
      • If found and status=INACTIVE (expectedVersion > 0), generate UserHostActivatedEvent.
      • If found and status=ACTIVE (expectedVersion > 0), reject (already active, idempotent no-op).
    • When the “remove host from user” command comes in:
      • Load the UserHostMapping aggregate.
      • If not found or status=INACTIVE, reject (already inactive/not found).
      • If status=ACTIVE, generate UserHostDeactivatedEvent.
  4. PortalEventConsumer Logic (Read Model Update):

    • For UserHostActivatedEvent:

      • This event means the mapping is now active.
      • Try to UPDATE user_host_t SET status='ACTIVE', aggregate_version=? WHERE host_id=? AND user_id=? AND aggregate_version=?.
      • If 0 rows updated:
        • Check if the record exists (SELECT COUNT(*) ...).
        • If it exists (and version didn’t match), it’s a ConcurrencyException.
        • If it doesn’t exist, it’s the very first time this mapping became active, so INSERT INTO user_host_t (...) VALUES (...).
      • This will handle both initial creation and reactivation as idempotent updates/inserts based on state (see the sketch after this list).
    • For UserHostDeactivatedEvent:

      • This event means the mapping is now inactive.
      • UPDATE user_host_t SET status='INACTIVE', aggregate_version=? WHERE host_id=? AND user_id=? AND aggregate_version=?.
      • If 0 rows updated, it’s either ConcurrencyException or “not found” (already inactive).
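
For the UserHostActivatedEvent handling described in step 4, the update-then-insert fallback might look roughly like this JDBC sketch. The method name and parameters are illustrative, and the next section shows an UPSERT that collapses the two statements into one:

// Sketch: apply a UserHostActivatedEvent to user_host_t with an update-first strategy.
// Assumes an open java.sql.Connection and UUID identifiers, as in the consumer code later on.
void applyActivated(java.sql.Connection conn, java.util.UUID hostId, java.util.UUID userId,
                    long newVersion, long expectedVersion) throws java.sql.SQLException {
    String update = "UPDATE user_host_t SET status='ACTIVE', aggregate_version=? " +
                    "WHERE host_id=? AND user_id=? AND aggregate_version=?";
    try (java.sql.PreparedStatement ps = conn.prepareStatement(update)) {
        ps.setLong(1, newVersion);
        ps.setObject(2, hostId);
        ps.setObject(3, userId);
        ps.setLong(4, expectedVersion);
        if (ps.executeUpdate() > 0) return; // reactivation succeeded
    }
    // 0 rows updated: either the row doesn't exist yet (first activation) or the version didn't match.
    String count = "SELECT COUNT(*) FROM user_host_t WHERE host_id=? AND user_id=?";
    try (java.sql.PreparedStatement ps = conn.prepareStatement(count)) {
        ps.setObject(1, hostId);
        ps.setObject(2, userId);
        try (java.sql.ResultSet rs = ps.executeQuery()) {
            rs.next();
            if (rs.getLong(1) > 0) {
                // Row exists but version didn't match: a ConcurrencyException in the real consumer.
                throw new IllegalStateException("Concurrency conflict for " + hostId + "/" + userId);
            }
        }
    }
    String insert = "INSERT INTO user_host_t (host_id, user_id, status, aggregate_version) VALUES (?, ?, 'ACTIVE', ?)";
    try (java.sql.PreparedStatement ps = conn.prepareStatement(insert)) {
        ps.setObject(1, hostId);
        ps.setObject(2, userId);
        ps.setLong(3, newVersion);
        ps.executeUpdate();
    }
}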

This approach treats the user_host_t relationship as a single logical entity (an aggregate instance) that transitions through states (ACTIVE/INACTIVE), ensuring the aggregate_version always progresses monotonically and avoiding the conflict you described.

Command Handler Logic

It is crucial to separate the database logic of the read model (what the UI sees) from that of the command model (what the command handler needs to decide). The command handler cannot rely solely on the UI’s expectedVersion in this scenario. It needs to query its own source of truth (the Event Store) to decide if it’s an “initial activation” or a “reactivation.”

Let’s refine the command handling logic for the UserHostMapping aggregate.


Key: The Command Handler Owns the Decision, Using the Event Store

The command handler’s job is to:

  1. Load the aggregate’s current state (by replaying events from event_store_t).
  2. Determine its current status and current version based on that replay.
  3. Compare the expectedVersion from the command with the aggregate’s currentVersion.
  4. Apply business rules to decide what event(s) to generate.

Event Types & Aggregate ID (as per previous recommendation)

  • Aggregate ID: A composite of hostId and userId (e.g., hostId + "_" + userId).
  • Events:
    • UserHostActivatedEvent: Represents the relationship becoming active.
    • UserHostDeactivatedEvent: Represents the relationship becoming inactive.
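
The aggregate and handler code below assume event types shaped roughly like the following sketch. The field order mirrors the constructor calls in that code, but these records are illustrative stand-ins rather than the actual light-portal classes:

import java.time.Instant;
import java.util.UUID;

// Minimal domain-event sketch used by the aggregate and handler code that follows.
public interface DomainEvent {
    long getSequenceNumber();
}

record UserHostActivatedEvent(UUID eventId, Instant occurredAt, String aggregateId,
                              String aggregateType, long sequenceNumber,
                              String hostId, String userId) implements DomainEvent {
    @Override public long getSequenceNumber() { return sequenceNumber; }
}

record UserHostDeactivatedEvent(UUID eventId, Instant occurredAt, String aggregateId,
                                String aggregateType, long sequenceNumber,
                                String hostId, String userId) implements DomainEvent {
    @Override public long getSequenceNumber() { return sequenceNumber; }
}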

Step-by-Step Command Handling Logic

Let’s assume your command handler is UserHostMappingCommandHandler and it interacts with a UserHostMappingAggregate.

1. UserHostMappingAggregate (Internal Logic):

This aggregate needs to rebuild its state (currentStatus, currentVersion) from its event stream.

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class UserHostMappingAggregate {
    private static final Logger logger = LoggerFactory.getLogger(UserHostMappingAggregate.class);

    private final String hostId;
    private final String userId;
    private UserHostMappingStatus currentStatus; // Enum: ACTIVE, INACTIVE, NON_EXISTENT
    private long currentVersion; // Sequence number of the last applied event

    private final List<DomainEvent> uncommittedEvents = new ArrayList<>();

    public UserHostMappingAggregate(String hostId, String userId) {
        this.hostId = hostId;
        this.userId = userId;
        this.currentStatus = UserHostMappingStatus.NON_EXISTENT; // Initial state
        this.currentVersion = 0;
    }

    public static UserHostMappingAggregate loadFromEvents(String hostId, String userId, List<DomainEvent> history) {
        UserHostMappingAggregate aggregate = new UserHostMappingAggregate(hostId, userId);
        if (history != null && !history.isEmpty()) {
            history.forEach(aggregate::applyEvent);
        }
        return aggregate;
    }

    private void applyEvent(DomainEvent event) {
        if (event instanceof UserHostActivatedEvent) {
            this.currentStatus = UserHostMappingStatus.ACTIVE;
        } else if (event instanceof UserHostDeactivatedEvent) {
            this.currentStatus = UserHostMappingStatus.INACTIVE;
        }
        this.currentVersion = event.getSequenceNumber(); // Update version based on event
    }

    // --- Command Handling Methods ---

    public void activateMapping(long expectedVersion) {
        // OCC Check (optional here, but good practice if not relying solely on DB constraint)
        if (this.currentVersion != expectedVersion) {
            throw new ConcurrencyException("Concurrency conflict. Expected version " + expectedVersion + ", actual " + this.currentVersion);
        }

        // Business Logic: What state must it be in to activate?
        if (this.currentStatus == UserHostMappingStatus.ACTIVE) {
            // Already active, idempotent no-op or reject as invalid transition
            logger.info("Mapping for user {} host {} is already active. No new event generated.", userId, hostId);
            return;
        }

        // Generate new event
        long nextVersion = this.currentVersion + 1;
        UserHostActivatedEvent event = new UserHostActivatedEvent(
            UUID.randomUUID(), Instant.now(), getAggregateId(), "UserHostMapping", nextVersion, hostId, userId
        );
        uncommittedEvents.add(event);
        applyEvent(event); // Apply to internal state immediately for consistency
    }

    public void deactivateMapping(long expectedVersion) {
        // OCC Check
        if (this.currentVersion != expectedVersion) {
            throw new ConcurrencyException("Concurrency conflict. Expected version " + expectedVersion + ", actual " + this.currentVersion);
        }

        // Business Logic
        if (this.currentStatus != UserHostMappingStatus.ACTIVE) {
            logger.info("Mapping for user {} host {} is not active. Cannot deactivate.", userId, hostId);
            throw new IllegalStateException("Mapping is not active and cannot be deactivated.");
        }

        // Generate new event
        long nextVersion = this.currentVersion + 1;
        UserHostDeactivatedEvent event = new UserHostDeactivatedEvent(
            UUID.randomUUID(), Instant.now(), getAggregateId(), "UserHostMapping", nextVersion, hostId, userId
        );
        uncommittedEvents.add(event);
        applyEvent(event);
    }
    
    // Helper to get the composite aggregate ID
    public String getAggregateId() {
        return hostId + "_" + userId; // Consistent composite ID
    }

    // Getters for external access
    public UserHostMappingStatus getCurrentStatus() { return currentStatus; }
    public long getCurrentVersion() { return currentVersion; }
    public List<DomainEvent> getUncommittedEvents() { return uncommittedEvents; }
    public void markEventsCommitted() { uncommittedEvents.clear(); }

    public enum UserHostMappingStatus {
        ACTIVE, INACTIVE, NON_EXISTENT
    }
}

2. UserHostMappingCommandHandler (Application Service):

This is where the command logic happens. The key is that the command from the UI is now generic (e.g., SetUserHostMappingStatus).

public class UserHostMappingCommandHandler { // This is your application service
    private final EventStoreEventRepository eventStoreRepository; // To load events
    private final OutboxMessageRepository outboxRepository;     // To save new events

    // Constructor injection
    // ...

    public void handleSetUserHostMappingStatus(String hostId, String userId, boolean activate, long expectedVersionFromUI) {
        String aggregateId = hostId + "_" + userId;
        
        // 1. Load aggregate state from Event Store
        List<DomainEvent> history = eventStoreRepository.findByAggregateIdOrderBySequenceNumberAsc(aggregateId)
                                       .stream()
                                       .map(this::deserializeEventStoreEvent) // Deserialize from DB format
                                       .collect(Collectors.toList());
        UserHostMappingAggregate aggregate = UserHostMappingAggregate.loadFromEvents(hostId, userId, history);

        // 2. Perform business logic based on intent (activate) and current state
        if (activate) {
            aggregate.activateMapping(expectedVersionFromUI); // Will generate UserHostActivatedEvent
        } else {
            aggregate.deactivateMapping(expectedVersionFromUI); // Will generate UserHostDeactivatedEvent
        }

        // 3. Persist new events
        List<DomainEvent> newEvents = aggregate.getUncommittedEvents();
        if (!newEvents.isEmpty()) {
            // Your transactional outbox logic (save to Event Store and Outbox)
            eventStoreRepository.saveAll(newEvents.stream().map(this::mapToEventStoreEvent).collect(Collectors.toList()));
            outboxRepository.saveAll(newEvents.stream().map(this::mapToOutboxMessage).collect(Collectors.toList()));
            aggregate.markEventsCommitted();
        }
    }
    
    // Helper methods for serialization/deserialization as shown in previous examples
    // ...
}

3. PortalEventConsumer Logic (Read Model Update):

The consumer updates user_host_t based on the events.

  • For UserHostActivatedEvent:

    // In your PortalEventConsumer (inside processSingleEventWithRetries for this event type)
    Map<String, Object> eventData = extractEventData(eventMap);
    String hostId = (String) eventMap.get(Constants.HOST); // Assuming hostId is a CE extension
    String userId = (String) eventMap.get(Constants.USER); // Assuming userId is a CE extension
    String aggregateId = (String) eventMap.get(CloudEventV1.SUBJECT); // Or extract from eventData if set as such
    long newVersion = getEventSequenceNumber(eventMap);
    
    // SQL: UPSERT is ideal here. If record exists, update status/version. If not, insert.
    // This handles both initial activation (INSERT) and reactivation (UPDATE) idempotently.
    final String upsertSql = "INSERT INTO user_host_t (host_id, user_id, status, aggregate_version, update_user, update_ts) " +
                             "VALUES (?, ?, ?, ?, ?, ?) " +
                             "ON CONFLICT (host_id, user_id) DO UPDATE SET " +
                             "status = EXCLUDED.status, " +
                             "aggregate_version = EXCLUDED.aggregate_version, " +
                             "update_user = EXCLUDED.update_user, " +
                             "update_ts = EXCLUDED.update_ts " +
                             "WHERE user_host_t.aggregate_version < EXCLUDED.aggregate_version"; // Only update if incoming event is newer
    
    try (PreparedStatement statement = conn.prepareStatement(upsertSql)) {
        statement.setObject(1, UUID.fromString(hostId));
        statement.setObject(2, UUID.fromString(userId));
        statement.setString(3, UserHostMappingAggregate.UserHostMappingStatus.ACTIVE.name());
        statement.setLong(4, newVersion);
        statement.setString(5, (String)eventMap.get(Constants.USER)); // From CE extension
        statement.setObject(6, OffsetDateTime.parse((String)eventMap.get(CloudEventV1.TIME)));
        statement.executeUpdate();
    }
    
    • Crucial ON CONFLICT ... WHERE user_host_t.aggregate_version < EXCLUDED.aggregate_version: This makes the projection update idempotent and handles out-of-order delivery. If the database already has a newer version than the incoming event, it simply does nothing (0 rows affected), preventing a stale event from overwriting a more recent state.
  • For UserHostDeactivatedEvent:

    // In your PortalEventConsumer (inside processSingleEventWithRetries for this event type)
    Map<String, Object> eventData = extractEventData(eventMap);
    String hostId = (String) eventMap.get(Constants.HOST);
    String userId = (String) eventMap.get(Constants.USER);
    long newVersion = getEventSequenceNumber(eventMap);
    
    final String updateSql = "UPDATE user_host_t SET status='INACTIVE', aggregate_version=?, update_user=?, update_ts=? " +
                             "WHERE host_id = ? AND user_id = ? AND aggregate_version < ?"; // Only update if incoming event is newer
    
    try (PreparedStatement statement = conn.prepareStatement(updateSql)) {
        statement.setLong(1, newVersion);
        statement.setString(2, (String)eventMap.get(Constants.USER));
        statement.setObject(3, OffsetDateTime.parse((String)eventMap.get(CloudEventV1.TIME)));
        statement.setObject(4, UUID.fromString(hostId));
        statement.setObject(5, UUID.fromString(userId));
        statement.setLong(6, newVersion); // Only update if current DB version < newVersion (from event)
        statement.executeUpdate();
    }
    

How to Figure it Out in the Command Handler (from UI perspective)

The UI will initially query the user_host_t read model.

  • Scenario A: UI queries, no record for (U,H) found.

    • UI infers state is “Non-Existent” or “Inactive”.
    • UI provides expectedVersion = 0 to the command (because the read model had no entry).
    • Command handler: aggregate.currentStatus == NON_EXISTENT. Generates UserHostActivatedEvent (sequence_number=1).
  • Scenario B: UI queries, record (U,H, status=ACTIVE, aggregate_version=1) found.

    • UI provides expectedVersion = 1 to the command.
    • User wants to “remove.”
    • Command handler: aggregate.currentStatus == ACTIVE. Generates UserHostDeactivatedEvent (sequence_number=2).
  • Scenario C: UI queries, record (U,H, status=INACTIVE, aggregate_version=2) found. (This assumes your UI could list inactive items, or an admin UI can see it.)

    • UI provides expectedVersion = 2 to the command.
    • User wants to “add back” / “reactivate.”
    • Command handler: aggregate.currentStatus == INACTIVE. Generates UserHostActivatedEvent (sequence_number=3).
  • Crucial UI Aspect: If the UI doesn’t display inactive items (which is typical for a “list active” view), and the user tries to “add” an item that used to exist but is now inactive, the UI would initially send expectedVersion = 0.

    • Command handler receives expectedVersion = 0, but aggregate is actually INACTIVE at version=2.
    • OCC Conflict! aggregate.currentVersion (2) != expectedVersion (0). Command is rejected.
    • User Experience: “Cannot add. This mapping exists in an inactive state. Please activate it instead.” This forces a clearer UI workflow.

By leveraging state transitions within your aggregate and using ON CONFLICT / WHERE aggregate_version < EXCLUDED.aggregate_version in your projection updates, you ensure strict versioning, idempotent read models, and a consistent business logic flow.

Tenant Migration and Environment Promotion

This document addresses the reality of tenant migration and environment promotion in an Event Sourcing context. You must preserve the event sequence (aggregate_version) while making necessary adjustments (hostId, new userId UUIDs) to fit the target environment.


Design Strategy: The Event Mutator

The best design is to introduce a specific, configurable pipeline stage—an Event Mutator—that runs after deserialization but before the final DB insert.

We’ll define the replacement and enrichment parameters as JSON/YAML structures and create a separate utility to apply the mutations.

1. Mutation Configuration Format

We’ll define the parameters to be a JSON string representing a list of mutation rules.

  • replacement (-r): Find a field with an old value and replace it with a new value.
    • Example: [{"field": "hostId", "from": "UUID_A", "to": "UUID_B"}, {"field": "user_id", "from": "ID_X", "to": "ID_Y"}]
  • enrichment (-e): Find a field and generate a new, unique value for it.
    • Example: [{"field": "id", "action": "generateUUID"}, {"field": "userId", "action": "mapAndGenerate", "sourceField": "originalUserId"}]

2. The EventMutator Class

This class will handle parsing the configuration and applying the changes to the CloudEvent attributes and the data payload.
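
As a usage sketch against the constructor and mutate method shown below, the -r and -e values would be wired into the importer loop roughly like this (the deserializedEvents collection and the persistence step are assumed importer internals):

// Hypothetical wiring inside the importer: build the mutator once, apply it to every event.
String replacementJson = "[{\"field\":\"hostId\",\"from\":\"UUID_A\",\"to\":\"UUID_B\"}]";
String enrichmentJson  = "[{\"field\":\"id\",\"action\":\"generateUUID\"}]";
EventMutator mutator = new EventMutator(replacementJson, enrichmentJson);

for (CloudEvent original : deserializedEvents) {    // deserializedEvents: assumed importer input
    CloudEvent mutated = mutator.mutate(original);  // runs after deserialization, before the DB insert
    // ... persist 'mutated' into event_store_t as the importer already does
}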


Refactored Cli.java and New EventMutator Logic

Here is the updated Cli.java and a conceptual EventMutator structure.

A. New Class: EventMutator.java

This class handles the core logic. Since CloudEvents are immutable, any change requires rebuilding the event (CloudEventBuilder.v1(cloudEvent)).

package net.lightapi.importer;

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.networknt.config.Config;
import com.networknt.utility.UuidUtil;
import io.cloudevents.CloudEvent;
import io.cloudevents.core.builder.CloudEventBuilder;
import net.lightapi.portal.PortalConstants;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class EventMutator {
    private static final Logger logger = LoggerFactory.getLogger(EventMutator.class);
    private final ObjectMapper mapper = Config.getInstance().getMapper();
    
    // Parsed list of rules
    private final List<Map<String, String>> replacementRules;
    private final List<Map<String, String>> enrichmentRules;

    // Map to track generated UUIDs for consistent replacement across events (e.g., old user ID -> new user ID)
    private final Map<String, String> generatedIdMap = new HashMap<>();

    public EventMutator(String replacementJson, String enrichmentJson) {
        this.replacementRules = parseRules(replacementJson);
        this.enrichmentRules = parseRules(enrichmentJson);
    }

    private List<Map<String, String>> parseRules(String json) {
        if (json == null || json.isEmpty()) return Collections.emptyList();
        try {
            return mapper.readValue(json, new TypeReference<List<Map<String, String>>>() {});
        } catch (IOException e) {
            logger.error("Failed to parse mutation rules JSON: {}", json, e);
            throw new IllegalArgumentException("Invalid JSON format for mutation rules.", e);
        }
    }

    /**
     * Applies all replacement and enrichment rules to a single CloudEvent.
     * @param originalEvent The original CloudEvent object.
     * @return The mutated CloudEvent.
     */
    public CloudEvent mutate(CloudEvent originalEvent) {
        CloudEventBuilder builder = CloudEventBuilder.v1(originalEvent);
        Map<String, Object> dataMap = null;

        // Deserialize data payload once (if present)
        if (originalEvent.getData() != null && originalEvent.getData().toBytes().length > 0) {
            try {
                dataMap = mapper.readValue(originalEvent.getData().toBytes(), new TypeReference<HashMap<String, Object>>() {});
            } catch (IOException e) {
                logger.error("Failed to deserialize CloudEvent data for mutation. Skipping data mutation.", e);
                // Continue with just extension mutation
            }
        }

        // 1. Apply Replacements
        boolean dataMutated = applyReplacements(originalEvent, builder, dataMap);

        // 2. Apply Enrichments
        dataMutated |= applyEnrichments(originalEvent, builder, dataMap);

        // Rebuild CloudEvent data only if the payload was changed
        if (dataMap != null && dataMutated) {
            try {
                String contentType = originalEvent.getDataContentType() != null ? originalEvent.getDataContentType() : "application/json";
                builder.withData(contentType, mapper.writeValueAsBytes(dataMap));
            } catch (IOException e) {
                logger.error("Failed to serialize mutated CloudEvent data. Keeping the original payload.", e);
            }
        }

        return builder.build();
    }
    
    // --- Private Mutation Helpers ---

    private boolean applyReplacements(CloudEvent originalEvent, CloudEventBuilder builder, Map<String, Object> dataMap) {
        boolean dataMutated = false;
        for (Map<String, String> rule : replacementRules) {
            String field = rule.get("field");
            String from = rule.get("from");
            String to = rule.get("to");
            if (field == null || from == null || to == null) continue;

            // Check CloudEvent Extensions (including known attributes like host, user)
            Object extensionValue = originalEvent.getExtension(field);
            if (extensionValue != null && extensionValue.toString().equals(from)) {
                builder.withExtension(field, to);
                logger.debug("Replaced extension {} from {} to {}", field, from, to);
            }

            // Check CloudEvent Data Payload
            if (dataMap != null && dataMap.get(field) != null && dataMap.get(field).toString().equals(from)) {
                dataMap.put(field, to);
                dataMutated = true; // Flag that data was mutated
                logger.debug("Replaced data field {} from {} to {}", field, from, to);
            }
        }
        return dataMutated;
    }
    
    private boolean applyEnrichments(CloudEvent originalEvent, CloudEventBuilder builder, Map<String, Object> dataMap) {
        boolean dataMutated = false;
        for (Map<String, String> rule : enrichmentRules) {
            String field = rule.get("field");
            String action = rule.get("action");
            if (field == null || action == null) continue;

            String generatedId = null;

            if ("generateUUID".equalsIgnoreCase(action)) {
                // Generate and cache a new UUID for the whole import run if needed, or always generate new.
                // For simplicity, we assume we generate a new UUID for the field.
                generatedId = UuidUtil.getUUID().toString();
            } else if ("mapAndGenerate".equalsIgnoreCase(action)) {
                String sourceField = rule.get("sourceField");
                String originalId = null;

                // Get the original ID from a source field in the data payload (e.g., from an 'oldUserId' field)
                if (dataMap != null && sourceField != null && dataMap.get(sourceField) != null) {
                    originalId = dataMap.get(sourceField).toString();
                }
                // Or get from a specific CloudEvent extension/subject
                else if ("subject".equalsIgnoreCase(sourceField) && originalEvent.getSubject() != null) {
                    originalId = originalEvent.getSubject();
                }

                if (originalId != null) {
                    // Check cache for consistency (e.g., ensure old_user_ID_A always maps to new_user_ID_X)
                    generatedId = generatedIdMap.computeIfAbsent(field + ":" + originalId, k -> UuidUtil.getUUID().toString());
                    logger.debug("Mapped original ID {} to new ID {}", originalId, generatedId);
                } else {
                    // Cannot map, fall back to simple UUID generation if allowed
                    generatedId = UuidUtil.getUUID().toString();
                }
            } else if ("aggregateIdMap".equalsIgnoreCase(action) && field.equals("subject")) {
                // This complex logic is for when a related aggregate ID needs to be updated.
                // E.g., when importing a User, the UserCreatedEvent ID is the new Subject/AggregateId.
                // The actual logic for this is too complex for a generic SMT and relies on a separate lookup service.
                // Skip for this simple mutator.
                continue;
            }

            if (generatedId != null) {
                // Mutate CloudEvent Extensions (Subject, ID, etc.)
                if ("id".equalsIgnoreCase(field)) {
                    builder.withId(generatedId);
                } else if ("subject".equalsIgnoreCase(field)) {
                    builder.withSubject(generatedId);
                } else if (originalEvent.getExtension(field) != null) { // Custom extension
                    builder.withExtension(field, generatedId);
                }

                // Mutate Data Payload
                if (dataMap != null && dataMap.containsKey(field)) {
                    dataMap.put(field, generatedId);
                    dataMutated = true; // Flag that data was mutated
                }
                logger.debug("Enriched field {} with new ID {}", field, generatedId);
            }
        }
        return dataMutated;
    }
}

B. Updated Cli.java to Integrate EventMutator

package net.lightapi.importer;

// ... (Existing imports) ...
import com.networknt.config.JsonMapper;
import com.networknt.db.provider.SqlDbStartupHook;
import com.networknt.monad.Result;
import com.networknt.service.SingletonServiceFactory;
import com.networknt.status.Status;
import com.networknt.utility.Constants;
import com.networknt.utility.UuidUtil; // Used in mutator
import io.cloudevents.CloudEvent;
import io.cloudevents.core.builder.CloudEventBuilder;
import io.cloudevents.core.format.EventFormat;
import io.cloudevents.core.provider.EventFormatProvider;
import io.cloudevents.jackson.JsonFormat;
import net.lightapi.portal.EventTypeUtil;
import net.lightapi.portal.PortalConstants;
import net.lightapi.portal.db.PortalDbProvider;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.UUID; // Used in mutator

public class Cli {
    private static final Logger logger = LoggerFactory.getLogger(Cli.class); // Added logger
    public static PortalDbProvider dbProvider;
    public static SqlDbStartupHook sqlDbStartupHook;

    @Parameter(names={"--filename", "-f"}, required = false,
            description = "The filename to be imported.")
    String filename;

    @Parameter(names={"--batchSize", "-b"}, required = false,
            description = "Number of events to import per database transaction batch. Default is 1000.")
    int batchSize = 1000;

    @Parameter(names={"--replacement", "-r"}, required = false,
            description = "JSON array string of replacement rules: [{'field': 'oldHostId', 'from': 'UUID_A', 'to': 'UUID_B'}].")
    String replacement;

    @Parameter(names={"--enrichment", "-e"}, required = false,
            description = "JSON array string of enrichment rules: [{'field': 'userId', 'action': 'mapAndGenerate', 'sourceField': 'oldUserId'}].")
    String enrichment;

    @Parameter(names={"--help", "-h"}, help = true)
    private boolean help;

    public static void main(String ... argv) throws Exception {
        // Build JCommander outside the try block so it is still in scope for usage() in the catch block.
        Cli cli = new Cli();
        JCommander jCommander = JCommander.newBuilder().addObject(cli).build();
        try {
            // ... (Startup initialization remains the same) ...
            jCommander.parse(argv);
            // Assuming SingletonServiceFactory and SqlDbStartupHook setup is correct
            dbProvider = (PortalDbProvider) SingletonServiceFactory.getBean(DbProvider.class);
            cli.run(jCommander);

        } catch (ParameterException e) {
            System.err.println("Command line parameter error: " + e.getLocalizedMessage());
            jCommander.usage();
        } catch (Exception e) {
            System.err.println("An unexpected error occurred during startup or import: " + e.getLocalizedMessage());
            e.printStackTrace();
        }
    }

    public void run(JCommander jCommander) throws Exception {
        if (help) {
            jCommander.usage();
            return;
        }

        logger.info("Starting event import with batch size: {}", batchSize);
        if (replacement != null) logger.info("Replacement rules: {}", replacement);
        if (enrichment != null) logger.info("Enrichment rules: {}", enrichment);

        EventFormat cloudEventFormat = EventFormatProvider.getInstance().resolveFormat(JsonFormat.CONTENT_TYPE);
        if (cloudEventFormat == null) {
            logger.error("No CloudEvent JSON format provider found.");
            throw new IllegalStateException("CloudEvent JSON format not found.");
        }

        // --- Instantiate EventMutator ---
        EventMutator mutator = new EventMutator(replacement, enrichment);
        
        List<CloudEvent> currentBatch = new ArrayList<>(batchSize);
        long importedCount = 0;
        long lineNumber = 0;

        try (BufferedReader reader = new BufferedReader(new FileReader(filename))) {
            String line;
            while((line = reader.readLine()) != null) {
                lineNumber++;
                if(line.startsWith("#") || line.trim().isEmpty()) continue;

                try {
                    // Assuming format: "key value" (where key is user_id, value is the full database row JSON)
                    int firstSpace = line.indexOf(" ");
                    if (firstSpace == -1) {
                        logger.warn("Skipping malformed line {} (no space separator): {}", lineNumber, line);
                        continue;
                    }
                    String dbRowJson = line.substring(firstSpace + 1); // <<< Full DB row JSON

                    // 1. Deserialize the nested CloudEvent (The Fix from prior step)
                    Map<String, Object> dbRowMap = Config.getInstance().getMapper().readValue(dbRowJson, new TypeReference<HashMap<String, Object>>() {});
                    String cloudEventJsonFromPayload = (String) dbRowMap.get("payload"); 
                    CloudEvent cloudEvent = cloudEventFormat.deserialize(cloudEventJsonFromPayload.getBytes(StandardCharsets.UTF_8));
                    
                    // 2. Perform Mutation/Enrichment
                    CloudEvent mutatedEvent = mutator.mutate(cloudEvent);

                    // 3. Finalization/Validation (Transfer critical top-level DB fields to Extensions)
                    // Transferring nonce and aggregateVersion from the exported DB row into the CloudEvent's extensions.
                    Object dbNonceObj = dbRowMap.get("nonce");
                    if (dbNonceObj instanceof Number) {
                        mutatedEvent = CloudEventBuilder.v1(mutatedEvent)
                                .withExtension(PortalConstants.NONCE, ((Number)dbNonceObj).longValue())
                                .build();
                    }
                    Object dbAggVersionObj = dbRowMap.get("aggregateVersion");
                    if (dbAggVersionObj instanceof Number) {
                        mutatedEvent = CloudEventBuilder.v1(mutatedEvent)
                                .withExtension(PortalConstants.EVENT_AGGREGATE_VERSION, ((Number)dbAggVersionObj).longValue())
                                .build();
                    }
                    
                    // 4. Add to current batch.
                    currentBatch.add(mutatedEvent);

                    // If batch is full, process it
                    if (currentBatch.size() >= batchSize) {
                        processBatch(currentBatch); 
                        importedCount += currentBatch.size();
                        currentBatch.clear();
                    }

                } catch (Exception e) {
                    logger.error("Error processing line {}: {}", lineNumber, e.getMessage(), e);
                    // Log and continue to process the rest of the file.
                }
            } // end while loop

            // Process any remaining events in the last batch
            if (!currentBatch.isEmpty()) {
                processBatch(currentBatch);
                importedCount += currentBatch.size();
            }

        } catch (IOException e) {
            logger.error("Error reading file {}: {}", filename, e.getMessage(), e);
            throw e;
        } finally {
            logger.info("Import process finished. Total events successfully imported in batches: {}", importedCount);
        }
        logger.info("All Portal Events have been imported successfully from {}. Have fun!!!", filename);
    }

    /**
     * Processes a batch of CloudEvents by inserting them into the database in a single transaction.
     * @param batch The list of CloudEvents to insert.
     */
    private void processBatch(List<CloudEvent> batch) {
        // --- Transaction Management ---
        // The transaction logic is ideally handled inside dbProvider.insertEventStore
        // or by a wrapper method if insertEventStore doesn't handle transactions internally.
        
        Result<String> eventStoreResult = dbProvider.insertEventStore(batch.toArray(new CloudEvent[0]));
        
        if(eventStoreResult.isFailure()) {
            logger.error("Failed to insert batch of {} events. Rollback occurred. Error: {}", batch.size(), eventStoreResult.getError());
            // In a CLI, failing the batch often means stopping the entire import process 
            // to ensure data integrity, as a full rollback on the entire batch has occurred.
            // If you want to continue, you would need complex tracking of failed batches.
            // For now, logging the error is sufficient, and the method returns.
        } else {
            logger.info("Imported batch of {} records successfully.", batch.size());
        }
    }
}

Key Usage Examples for the CLI

When calling the CLI, you pass the mutation rules as a single JSON string (often enclosed in single quotes '...' in the shell):

1. Replace Host ID (Tenant Migration)

You moved from old_host_uuid to new_host_uuid.

java -jar importer.jar -f events.log -r '[{"field": "hostId", "from": "OLD_HOST_UUID", "to": "NEW_HOST_UUID"}]'

2. Replace Host ID and Generate New Aggregate IDs (Full Isolation)

You want to map the old userId to a new userId and generate new eventIds and subject (aggregate ID).

java -jar importer.jar -f events.log \
    -r '[{"field": "hostId", "from": "OLD_HOST_UUID", "to": "NEW_HOST_UUID"}]' \
    -e '[
        {"field": "id", "action": "generateUUID"}, 
        {"field": "subject", "action": "generateUUID"},
        {"field": "originalUserId", "action": "mapAndGenerate", "sourceField": "userId"}
    ]'

(Note: For the user mapping, you would need a custom solution that first reads a mapping table or performs a one-time query to get the originalUserId from a previous step, and then uses the mapping to generate the new ID consistently.)

Product Version Config

When using light-portal to manage the configurations for APIs or Apps, the configuration can be overridden at different levels. On top of the platform default, the product level and product version level are used very often.

There are two options:

  1. Extract the config files from the product jar and create the events for mapping. This includes all config files and config properties in the jar, per product and product version.

Pros:

  • Can be automated with a repeatable process.
  • Standardized and far less error-prone.

Cons:

  • It cannot be customized per organization.
  2. Manually create events for mappings per product and per product version, for the properties that are potentially changeable.

Pros:

  • Flexible and customizable per organization.
  • The process can be improved over time.

Cons:

  • May take some time to create and maintain the event file for every release.

Optimistic Concurrency Control (OCC)

In the previous document optimistic-pessimistic-ui, we decided to leverage OCC to prevent multiple users from updating the same aggregate at the same time from different browser sessions.

With OCC, we have the single point of necessary trust: the read model must be consistent enough to support the OCC check.

The concern here is the core trade-off of CQRS: Eventual Consistency.


The Problem: When Eventual Consistency Breaks OCC

Your system’s flow is:

  1. Read (UI): Reads ReadModel (V=5) from Projection DB.
  2. Write (Command Handler):
    • Command arrives with expectedVersion=5.
    • Handler verifies against Event Store (Source of Truth): EventStore.currentVersion must be 5.
  3. The Stale Read Model Gap (The Problem):
    • Event E6 is processed by the Command Handler and committed to EventStore (V=6).
    • Before the Consumer applies E6 to the Projection DB, the UI reads.
    • UI still reads ReadModel (V=5) (STALE).
    • User submits Command2 (expectedVersion=5).
    • The Conflict: The Command Handler checks EventStore.currentVersion which is now 6. It sees 6 != 5 and throws a ConcurrencyException.

Result: The user is incorrectly told there was a conflict and must refresh, even though their original read was perfectly valid and their change was submitted before any other user’s command. The issue is that the read model was too slow to reflect the change that already happened in the source of truth.


The Solution: Shift the OCC Check to the Event Store’s Version

The best way to handle this and eliminate the dependency on the read model’s consistency is to ensure the UI’s OCC is based on the authoritative version from the Event Store itself.

Here are three practical options for injecting the authoritative version.

The “best” option balances data consistency (critical) against performance and complexity (practical). Given the context of a high-performance CQRS/ES application, here is the evaluation and recommendation.


Evaluation of Options for OCC Version Retrieval

| Option | Where Version is Fetched | Consistency Status | Performance Impact | Complexity | Evaluation |
|---|---|---|---|---|---|
| 1. Join with event_store_t (Pagination Query) | Read Model + Event Store | Authoritative (Best) | High (slows down every page load; large joins are expensive). | High (complex SQL, need to avoid full table scans). | POOR (breaks read performance/scalability). |
| 2. Button Click/Form Load | Dedicated Version Service (Event Store) | Authoritative (Best) | Low/Moderate (1 extra, quick, targeted query per form load). | Low/Moderate (easy to implement service). | GOOD (decouples Read/Write, best UX). |
| 3. Command Submission | Dedicated Version Service (Event Store) | Authoritative (Best) | Low (1 extra query per command). | Low/Moderate (easy to implement service). | GOOD but FLAWED UX (causes more false failures). |

Recommendation (Option 2): Fetch the authoritative version when the user initiates the edit (button click / form load).

Why Option 2 is the Best Balance:

  1. Highest Consistency & UX: It provides the highest level of consistency without sacrificing the performance of the common “list entities” query. When the user loads the edit form, they are guaranteed to see the latest version. If another user commits a change before the form loads, the user will see the newest data and version, preventing the immediate “false conflict.”
  2. Performance Preservation: The most frequently executed query (queryAllEntitiesWithPagination) remains fast, hitting only the optimized Projection DB. The extra query (VersionLookup) only runs when a user takes the action to edit, which is a rare event compared to listing.
  3. Simplicity: It requires a simple, dedicated, fast endpoint in your backend (e.g., /api/version/role/{id}) that executes the SELECT MAX(sequence_number) ... query against your event_store_t.

Why the Other Options Fail:

  • Option 1 (Join with Pagination Query): Fails Scalability. Joining a wide, paginated projection table with a potentially massive, ever-growing event_store_t table (even with indexes) is a performance killer. It makes every single query slow. You use CQRS to avoid this kind of cross-cutting query.
  • Option 3 (Command Submission): Fails User Experience.
    • User loads data (Version 5).
    • User spends 5 minutes making changes.
    • During those 5 minutes, another user commits V6 and V7.
    • User submits Command (expectedVersion=5).
    • Handler fetches latest version (V7). Conflict: 7 != 5.
    • User is rejected and loses 5 minutes of work.
    • By contrast, Option 2 would have made the user refresh immediately upon clicking ‘Edit’ (because the version check would have failed then), saving the user from losing their work.

Implementation Flow for Option 2 (The Correct Flow)

  1. UI/List View: Populated from Projection.queryEntities(offset, limit, filters). This query is fast and returns the version from Read Model. (The version might be the stale one).
  2. User Action: User clicks “Edit” button for role_id=R1.
  3. Backend Call 1 (Version Check): UI calls a dedicated endpoint: /api/write/version/{aggregate_id} (R1). The backend executes SELECT MAX(aggregate_version) FROM event_store_t WHERE aggregate_id = 'R1'. Returns currentVersion = V.
  4. Version Comparison 1: Compare V with the aggregate_version of the UI form data derived from the list view. If they are the same, no further action is needed.
  5. Backend Call 2: If the form data version is less than the V from event_store_t, the UI calls /api/read/role/{id} to get fresh form data from the Read Model.
  6. Version Comparison 2: Compare V with the aggregate_version reloaded from the Read Model. In most cases, they should be the same. However, the Read Model might not be updated yet if there is consumer lag. In that case, an error message is shown on the UI asking the user to wait a few minutes and refresh. If the problem persists, the user should report it to the support team to get the issue resolved.
  7. UI Form: Data is populated. A hidden field is set to aggregateVersion = V.
  8. User Submission: UI sends UpdateCommand(..., expectedVersion=V) to the command endpoint.
  9. Command Handler: Executes OCC check against the Event Store. This check is now authoritative and highly likely to succeed.

Aggregate Version in Projection

Adding aggregate_version in all tables in read models is the most common, reliable, and scalable pattern to implement Optimistic Concurrency Control (OCC) in a CQRS/Event Sourcing system that uses a relational database for its read models.


Confirmation of the OCC Pattern

| Component | Responsibility for OCC | Details |
|---|---|---|
| Projection Tables (Read Model) | Store the Version | Required: Must have an aggregate_version column (e.g., BIGINT) on every entity row that represents an Aggregate Root. |
| Pagination/List Query (UI Read) | Retrieve the Version | Required: The API endpoint for listing entities must include the aggregate_version column in its SELECT statement and return it to the UI. |
| UI Form (Client) | Hold the Version | Required: The UI must store this retrieved aggregate_version (often in a hidden field) and send it back as expectedVersion with the next command. |
| Command Handler (Write Model) | Perform the Check | Required: When the command arrives, check: EventStore.actualVersion MUST EQUAL command.expectedVersion. |
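
As a minimal illustration (the table, column names, and pagination shape are assumptions based on the conventions used elsewhere in this document), the projection column and the list query might look like this:

-- The projection row carries the version of the last event applied to the aggregate.
ALTER TABLE role_t ADD COLUMN IF NOT EXISTS aggregate_version BIGINT NOT NULL DEFAULT 1;

-- The list/pagination query must return the version so the UI can echo it back
-- as expectedVersion on the next command.
SELECT role_id, role_desc, aggregate_version
FROM role_t
WHERE host_id = ?
ORDER BY role_id
LIMIT ? OFFSET ?;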

Summary of Why This is Necessary

  1. Atomicity of the Check: The aggregate_version in the read model serves as the handle for the OCC check. The UI has to pass some authoritative marker of the state it observed.
  2. Decoupling: By having the version in the read model, you avoid performing costly SELECT MAX(sequence_number) queries against the event_store_t for every single row in the pagination result. Instead, you only perform the authoritative version lookup (or the OCC check itself) on the one specific record the user is attempting to modify.
  3. Read/Write Split: This solution maintains the separation of concerns:
    • Read Side: Fast, optimized for retrieval.
    • Write Side: Slow, transactionally consistent, responsible for the final state check.

Final Recommendation:

Yes, we must include aggregate_version in all projected tables that are used as the basis for user updates, and it must be part of the data retrieved by the UI’s list queries.

This is the non-negotiable step to ensuring your access control system prevents the dangerous “Last-Write-Wins” scenario.

Refresh Data for Edit

We need to get the latest data after the user clicks the ‘Edit’ button, and there are two ways to get it: the Read Model or a Replay. Let’s clarify exactly what data consistency level is needed for the “Edit” form.

The answer is: You should read the data from the Read Model (Projection) and retrieve the latest aggregate_version from the Event Store.

You should NOT replay the Event Store to populate the UI form.


Analysis of the Two Read Operations

| Operation | Source | Purpose | Consistency Level | Performance |
|---|---|---|---|---|
| Data Retrieval | Read Model (role_t Projection) | To populate the UI form fields (name, description, etc.). | Eventual (it’s the data the user sees). | Fast (single row lookup by PK). |
| Version Retrieval | Event Store (event_store_t) | To provide the authoritative expectedVersion for OCC. | Strictly Authoritative (Source of Truth). | Fast (single SELECT MAX(sequence_number) WHERE aggregate_id=? query). |
| Replay Operation | Event Store (event_store_t) | To reconstruct the current state by re-running all events. | Source of Truth (highest fidelity). | Slow (involves reading many rows, deserialization, and business logic execution). |

Why Combining Read Model + Version Lookup is Best

The flow for the /api/read/role/{id} endpoint should be:

  1. Retrieve Authoritative Version:

    • Execute: SELECT MAX(sequence_number) AS authoritative_version FROM event_store_t WHERE aggregate_id = ?
    • (This is fast).
  2. Retrieve Data (The actual form fields):

    • Execute: SELECT * FROM role_t WHERE role_id = ?
    • (This is also fast).
  3. Combine and Return:

    • Return the data from the Read Model and replace the aggregate_version in the final JSON with the authoritative_version retrieved in Step 1.
    // Final API Response
    {
      "roleId": "R1",
      "roleDesc": "...", // Data from Read Model
      "updateUser": "...", // Data from Read Model
      "aggregateVersion": [Authoritative_Version_from_ES] // Replaced version from Event Store
    }
    

Reasons for this Approach:

  • Speed (Performance): Replaying the Event Store is computationally expensive and slow. You would never do this for a simple read-to-populate-a-form scenario unless the Read Model was completely missing the data. Using the existing projected data for the form fields is orders of magnitude faster.
  • Decoupling (Read/Write Split): This maintains the CQRS separation. The Read Model is still the source for what is displayed to the user. The Event Store is only queried for the transactional anchor (MAX(sequence_number)).
  • Consistency (High Enough): By replacing the read model’s version (V_stale) with the authoritative version (V_authoritative), you achieve transactional consistency for the critical OCC field, and high eventual consistency for the form data (since the event consumer should be very fast at updating the read model).

What if the Read Model is Out of Sync?

The only time this approach is problematic is if the read model is severely stale (e.g., the consumer has been down for hours). In this case, the form data the user sees might not reflect the most recent events, even though the aggregateVersion is correct.

  • Example: Event E6 added a field is_admin = true. The consumer is down. The Read Model doesn’t have is_admin = true. The UI loads the form, sets aggregateVersion=V6, but the form field for is_admin is missing.
  • Mitigation: The expectation in a healthy CQRS system is that the read model lag is measured in milliseconds, not minutes or hours. If lag is severe, the solution is to fix the consumer and the Eventual Consistency pipeline, not to slow down every read operation by resorting to full Event Replay.

Conclusion: The solution is to mix and match: Read data from the projection, but read the version from the source of truth (Event Store).
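
As a concrete illustration of this mix-and-match approach, here is a minimal JDBC sketch (assuming the usual java.sql and java.util imports; column names and types are illustrative, and the real endpoint would go through the db provider rather than raw JDBC):

// Hypothetical sketch: read form data from the projection, then override the version
// with the authoritative value from the event store.
public static Map<String, Object> readRoleForEdit(Connection conn, String roleId) throws SQLException {
    Map<String, Object> result = new HashMap<>();

    // 1. Data retrieval from the read model (fast single-row lookup by PK).
    try (PreparedStatement ps = conn.prepareStatement(
            "SELECT role_id, role_desc, aggregate_version FROM role_t WHERE role_id = ?")) {
        ps.setString(1, roleId);
        try (ResultSet rs = ps.executeQuery()) {
            if (rs.next()) {
                result.put("roleId", rs.getString("role_id"));
                result.put("roleDesc", rs.getString("role_desc"));
                result.put("aggregateVersion", rs.getLong("aggregate_version")); // possibly stale
            }
        }
    }

    // 2. Version retrieval from the source of truth, replacing the read-model version.
    try (PreparedStatement ps = conn.prepareStatement(
            "SELECT MAX(sequence_number) FROM event_store_t WHERE aggregate_id = ?")) {
        ps.setString(1, roleId);
        try (ResultSet rs = ps.executeQuery()) {
            if (rs.next()) {
                result.put("aggregateVersion", rs.getLong(1)); // authoritative version
            }
        }
    }
    return result;
}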

Eventual Consistency Trade Off

Availability/consistency trade-off is the most nuanced and important philosophical point in CQRS/Event Sourcing.

It is correct that Option 2 (Version Lookup on Button Click) temporarily breaks pure eventual consistency for the purpose of transactional integrity.

Here is the detailed elaboration on why this is necessary and how it redefines the consistency boundary, rather than fundamentally destroying the trade-off.


1. The Principle of Eventual Consistency (EC)

  • Definition: The system state (Read Model) will eventually equal the Source of Truth (Write Model/Event Store) after a small delay ($\Delta t$).
  • Trade-Off: You trade strong/immediate consistency for high availability and high performance (speed of writes and reads).
  • The Acceptable Lie: The Read Model is allowed to lie for $\Delta t$ seconds.

2. The Unacceptable Lie: Breaking Transactional Integrity

The moment a user wants to perform a write operation, the system must enforce Strong Consistency for that single transaction, regardless of the CQRS pattern.

  • Goal of the Transaction: To guarantee that the command (write) is based on a known, singular, correct state of the Aggregate.
  • The Problem: If we use the stale version from the Projection DB (V_stale), and the Write Model is at V_authoritative, one of two things happens:
    1. If V_authoritative > V_stale (Stale Read): The command is rejected (correctly by the Command Handler’s OCC check). The user is told to refresh.
    2. If we tried to bypass OCC: A new event is generated based on stale data, potentially creating an invalid state (e.g., inventory going negative). This is a data integrity failure.

Conclusion: For the Write Path, you must have Strong Consistency. The Write Path does not participate in the EC trade-off.

3. Why Option 2 is the Best Synthesis (The Redefined Trade-Off)

Option 2 (querying both the Projection and the Event Store) is a controlled and highly localized violation of pure EC that elevates transactional integrity.

| Operation | Consistency Mechanism | Status |
|---|---|---|
| A. List View | Eventual Consistency (EC) | Hits Projection DB only. Fast. Can be stale. $\checkmark$ |
| B. Button Click/Form Load | Read-Your-Own-Writes Consistency (RYOW) / Strong Consistency | Hits Event Store for Version ONLY. Checks V_stale against V_authoritative. If V_stale < V_authoritative, the read is aborted, forcing a fresh, consistent read for the form. $\checkmark$ |
| C. Command Submission | Optimistic Concurrency Control (OCC) / Strong Consistency | Hits Event Store for the final check. Guarantees data integrity. $\checkmark$ |

The “Controlled Violation” Justification

  1. Only for the Edit Form: The slow, authoritative query is only done for one Aggregate when a user chooses to edit it. This minimizes performance impact.
  2. Only for the Version: The authoritative query (SELECT MAX(sequence_number)) is the lightest possible query against the Event Store, ensuring minimum latency for this strong consistency check. It doesn’t incur the cost of full Event Replay or full Projection DB reads.
  3. Redefining the UX Contract: You are establishing a clear contract: “The List View is for quick display (EC). The Edit Form guarantees that the state you see is the last known state, and we checked the Source of Truth just for you (Strong Consistency).”

The Availability Trade-off is NOT Broken:

  • Availability: The Write Model (Command Handler) is still available. The Read Model (Projection) is still available. The system has not halted.
  • Performance: The List View remains fast (available). The Write Path remains fast (available).
  • The Cost: The cost is one extra, fast, authoritative query right before the edit begins, which is a small price to pay for preventing a major data integrity or user experience failure.

Summary

The pattern you are implementing with Option 2 (Projection Data + Event Store Version) is the standard, correct way to implement OCC across the Read-Write boundary in a performant CQRS/ES system. It is a necessary local “tightening” of the eventual consistency model to ensure transactional correctness where it absolutely matters: at the point of data modification.

Cascade Soft Delete

With the recent refactor, relying on ON DELETE CASCADE is no longer suitable after implementing soft deletes, because soft delete is an UPDATE operation (SET active = FALSE) and not a true DELETE from the database.

The pattern we should follow in an Event Sourcing / Event-Driven Architecture with soft deletes is:

1. The Principle of Causality (or Domain Consistency)

When a parent entity (e.g., role_t) is soft-deleted, all its dependent child entities (e.g., role_user_t, role_permission_t, etc.) must also be soft-deleted to maintain domain consistency. This cascade logic can be implemented in the projection service, the command handler, or the database.

2. Implementation in the Command/Event Handler/Database

Strategy A: Event Amplification

The command handler that received the initial command/event (e.g., DeleteRoleCommand -> RoleDeletedEvent) should not directly perform the cascading database updates. Instead, it should be responsible for emitting new cascading events for each child entity.

  1. Incoming Command: Generate a RoleDeletedEvent (for a specific role_id).
  2. Emitting Child Events: It then emits an event for each dependent child, such as RoleUserRemovedEvent(role_id, user_id) and RolePermissionRemovedEvent(role_id, permission_id).
  3. Event Store: Push an array of events to event_store_t and outbox_message_t tables in a transaction.
  4. Event Processor: All events will be processed in the same transaction to update parent table and child tables together.

Pro: Decoupled, explicit, and provides an audit trail for every change. Con: More complex event processing and increased event volume; all delete command handlers must be refactored to emit additional events, which is a significant code change and a long-term maintenance burden.

Strategy B: Direct Application-Level Cascade

In a service that primarily acts as a projection (CQRS read model) and is tightly coupled with its projection logic, the simplest approach is to bundle the cascading logic directly into the parent handler’s processing.

  1. Incoming Event: RoleDeletedEvent.
  2. Event Processor: The deleteRole(conn, event) method would execute the parent soft delete (UPDATE role_t SET active=FALSE).
  3. Cascading Updates: Immediately after, within the same transaction, it would execute multiple cascading UPDATE statements on the child tables. Make sure that only the active flag is updated based on the primary key for child tables.
// Inside deleteRole(Connection conn, Map<String, Object> event)
// 1. Soft delete the parent
// UPDATE role_t SET active = FALSE WHERE ...
// 2. Soft delete the children in the same transaction
// UPDATE role_user_t SET active = FALSE, update_user = ?, update_ts = ? WHERE host_id = ? AND role_id = ?
// UPDATE role_permission_t SET active = FALSE, update_user = ?, update_ts = ? WHERE host_id = ? AND role_id = ?

Pro: Simple, fast, and maintains transactional integrity easily. Con: Tightly couples the projection logic; no explicit events for child deletion in the event store; many db provider updates and long-term maintenance work.
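
For illustration only, a minimal JDBC sketch of Strategy B (assuming the usual java.sql imports; the update user and timestamp come from the event, and the caller owns the transaction):

// Soft delete the parent and its children in the caller's transaction.
private void softDeleteRoleCascade(Connection conn, Object hostId, Object roleId,
                                   String updateUser, Timestamp updateTs) throws SQLException {
    String[] statements = {
        "UPDATE role_t SET active = FALSE, update_user = ?, update_ts = ? WHERE host_id = ? AND role_id = ?",
        "UPDATE role_user_t SET active = FALSE, update_user = ?, update_ts = ? WHERE host_id = ? AND role_id = ?",
        "UPDATE role_permission_t SET active = FALSE, update_user = ?, update_ts = ? WHERE host_id = ? AND role_id = ?"
    };
    for (String sql : statements) {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, updateUser);
            ps.setTimestamp(2, updateTs);
            ps.setObject(3, hostId);
            ps.setObject(4, roleId);
            ps.executeUpdate();
        }
    }
    // No commit here: the event processor commits the whole batch, keeping parent and children atomic.
}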

Strategy C: Direct Database-Level Cascade

Create a trigger in the database to manage the cascade soft delete for child tables. This can be an individual trigger on each table or a centralized trigger applied to all tables.

Pro: Simple, fast, maintains transactional integrity easily; minimal code change in the application logic; easy to implement and maintain. Con: The project team must be aware of the trigger logic to avoid confusion.

Create a cascade_relationships_v view based on the foreign keys.

-- create a view to simplify the foreign key relationship. 

DROP VIEW IF EXISTS cascade_relationships_v;

CREATE VIEW cascade_relationships_v AS
WITH fk_details AS (
    SELECT 
        pn.nspname::text AS parent_schema,
        pc.relname::text AS parent_table,
        cn.nspname::text AS child_schema,
        cc.relname::text AS child_table,
        c.conname::text AS constraint_name,
        c.oid AS constraint_id,
        cc.oid AS child_table_oid,
        pc.oid AS parent_table_oid,
        unnest.parent_col,
        unnest.child_col,
        unnest.ord
    FROM pg_constraint c
    JOIN pg_class pc ON c.confrelid = pc.oid
    JOIN pg_namespace pn ON pc.relnamespace = pn.oid
    JOIN pg_class cc ON c.conrelid = cc.oid
    JOIN pg_namespace cn ON cc.relnamespace = cn.oid
    CROSS JOIN LATERAL (
        SELECT 
            unnest(c.confkey) AS parent_col,
            unnest(c.conkey) AS child_col,
            generate_series(1, array_length(c.conkey, 1)) AS ord
    ) unnest
    WHERE c.contype = 'f'
)
SELECT
    fd.parent_schema,
    fd.parent_table,
    fd.child_schema,
    fd.child_table,
    fd.constraint_name,
    -- Human readable mapping
    string_agg(
        format('%I → %I', 
            (SELECT attname FROM pg_attribute 
             WHERE attrelid = fd.parent_table_oid
               AND attnum = fd.parent_col),
            (SELECT attname FROM pg_attribute 
             WHERE attrelid = fd.child_table_oid
               AND attnum = fd.child_col)
        ), 
        ', ' ORDER BY fd.ord
    ) AS foreign_key_mapping,
    -- Structured data for trigger
    jsonb_object_agg(
        (SELECT attname FROM pg_attribute 
         WHERE attrelid = fd.parent_table_oid
           AND attnum = fd.parent_col),
        (SELECT attname FROM pg_attribute 
         WHERE attrelid = fd.child_table_oid
           AND attnum = fd.child_col)
    ) AS foreign_key_json,
    -- Arrays for easier processing
    array_agg(
        (SELECT attname FROM pg_attribute 
         WHERE attrelid = fd.parent_table_oid
           AND attnum = fd.parent_col)
        ORDER BY fd.ord
    ) AS parent_columns,
    array_agg(
        (SELECT attname FROM pg_attribute 
         WHERE attrelid = fd.child_table_oid
           AND attnum = fd.child_col)
        ORDER BY fd.ord
    ) AS child_columns,
    COUNT(*) AS column_count,
    fd.child_table_oid,
    fd.parent_table_oid,
    -- Check for required columns
    EXISTS (
        SELECT 1 FROM pg_attribute a
        WHERE a.attrelid = fd.parent_table_oid
          AND a.attname = 'delete_ts'
          AND NOT a.attisdropped
    ) AS parent_has_delete_ts,
    EXISTS (
        SELECT 1 FROM pg_attribute a
        WHERE a.attrelid = fd.child_table_oid
          AND a.attname = 'delete_ts'
          AND NOT a.attisdropped
    ) AS child_has_delete_ts,
    EXISTS (
        SELECT 1 FROM pg_attribute a
        WHERE a.attrelid = fd.parent_table_oid
          AND a.attname = 'delete_user'
          AND NOT a.attisdropped
    ) AS parent_has_delete_user,
    EXISTS (
        SELECT 1 FROM pg_attribute a
        WHERE a.attrelid = fd.child_table_oid
          AND a.attname = 'delete_user'
          AND NOT a.attisdropped
    ) AS child_has_delete_user
FROM fk_details fd
-- Only include relationships where both tables have deletion tracking
WHERE EXISTS (
    SELECT 1 FROM pg_attribute a
    WHERE a.attrelid = fd.parent_table_oid
      AND a.attname = 'delete_ts'
      AND NOT a.attisdropped
) AND EXISTS (
    SELECT 1 FROM pg_attribute a
    WHERE a.attrelid = fd.child_table_oid
      AND a.attname = 'delete_ts'
      AND NOT a.attisdropped
)
GROUP BY 
    fd.parent_schema, fd.parent_table,
    fd.child_schema, fd.child_table,
    fd.constraint_name, fd.constraint_id, 
    fd.child_table_oid, fd.parent_table_oid
ORDER BY fd.parent_schema, fd.parent_table, fd.child_schema, fd.child_table;

To test the view above.

SELECT * FROM cascade_relationships_v 
WHERE parent_table = 'api_t' AND child_table = 'api_version_t';

And the result.

parent_schema parent_table child_schema child_table   constraint_name                   foreign_key_mapping                foreign_key_json                           parent_columns       child_columns        column_count child_table_oid parent_table_oid parent_has_delete_ts child_has_delete_ts parent_has_delete_user child_has_delete_user 
------------- ------------ ------------ ------------- --------------------------------- ---------------------------------- ------------------------------------------ -------------------- -------------------- ------------ --------------- ---------------- -------------------- ------------------- ---------------------- --------------------- 
public        api_t        public       api_version_t api_version_t_host_id_api_id_fkey host_id → host_id, api_id → api_id {"api_id": "api_id", "host_id": "host_id"} ["host_id","api_id"] ["host_id","api_id"] 2            360279          360268           true                 true                true                   true                  

Create a trigger function that handles updating active to false (cascade soft delete) and back to true (restore).

CREATE OR REPLACE FUNCTION smart_cascade_soft_delete()
RETURNS TRIGGER AS $$
DECLARE
    fk_record RECORD;
    where_clause TEXT;
    query_text TEXT;
    column_index INT;
    current_user_name TEXT;
    deletion_context TEXT;
    deletion_context_pattern TEXT;
    delete_timestamp TIMESTAMP;
BEGIN
    -- Get current user
    current_user_name := current_user;
    
    -- Handle SOFT DELETE (active = false)
    IF NEW.active = FALSE AND OLD.active = TRUE THEN
        -- Generate deletion timestamp
        delete_timestamp := CURRENT_TIMESTAMP;
        
        -- Set deletion context
        deletion_context := format('PARENT_CASCADE_%s_%s', 
            TG_TABLE_NAME, 
            to_char(delete_timestamp, 'YYYYMMDD_HH24MISSMS')
        );
        
        -- Update parent with deletion context if columns exist
        IF EXISTS (
            SELECT 1 FROM information_schema.columns 
            WHERE table_schema = TG_TABLE_SCHEMA 
              AND table_name = TG_TABLE_NAME 
              AND column_name = 'delete_user'
        ) THEN
            NEW.delete_user := deletion_context;
        END IF;
        
        IF EXISTS (
            SELECT 1 FROM information_schema.columns 
            WHERE table_schema = TG_TABLE_SCHEMA 
              AND table_name = TG_TABLE_NAME 
              AND column_name = 'delete_ts'
        ) THEN
            NEW.delete_ts := delete_timestamp;
        END IF;
        
        -- Update parent's update columns
        NEW.update_ts := delete_timestamp;
        NEW.update_user := current_user_name;
        
        FOR fk_record IN
            SELECT *
            FROM cascade_relationships_v
            WHERE parent_schema = TG_TABLE_SCHEMA
              AND parent_table = TG_TABLE_NAME
        LOOP
            -- Build WHERE clause
            where_clause := '';
            FOR column_index IN 1..fk_record.column_count LOOP
                IF column_index > 1 THEN
                    where_clause := where_clause || ' AND ';
                END IF;
                where_clause := where_clause || format(
                    '%I = $1.%I',
                    fk_record.child_columns[column_index],
                    fk_record.parent_columns[column_index]
                );
            END LOOP;
            
            -- Add condition to only update currently active records
            where_clause := where_clause || ' AND active = TRUE';
            
            -- Cascade the soft delete with context
            query_text := format(
                'UPDATE %I.%I 
                 SET active = FALSE,
                     delete_ts = $2, 
                     delete_user = $3,
                     update_ts = $2,
                     update_user = $4
                 WHERE %s',
                fk_record.child_schema,
                fk_record.child_table,
                where_clause
            );
            
            EXECUTE query_text USING OLD, delete_timestamp, deletion_context, current_user_name;
        END LOOP;
        
    -- Handle RESTORE (active = true)
    ELSIF NEW.active = TRUE AND OLD.active = FALSE THEN
        -- Only restore children that were deleted by parent cascade
        
        FOR fk_record IN
            SELECT *
            FROM cascade_relationships_v
            WHERE parent_schema = TG_TABLE_SCHEMA
              AND parent_table = TG_TABLE_NAME
        LOOP
            -- Pattern to match cascade deletions
            deletion_context_pattern := format('PARENT_CASCADE_%s_%%', TG_TABLE_NAME);
            
            -- Build WHERE clause
            where_clause := '';
            FOR column_index IN 1..fk_record.column_count LOOP
                IF column_index > 1 THEN
                    where_clause := where_clause || ' AND ';
                END IF;
                where_clause := where_clause || format(
                    '%I = $1.%I',
                    fk_record.child_columns[column_index],
                    fk_record.parent_columns[column_index]
                );
            END LOOP;
            
            -- Only restore cascade-deleted records
            where_clause := where_clause || 
                ' AND delete_user LIKE $2 AND active = FALSE';
            
            -- Restore the records
            query_text := format(
                'UPDATE %I.%I 
                 SET active = TRUE,
                     delete_ts = NULL, 
                     delete_user = NULL,
                     update_ts = CURRENT_TIMESTAMP,
                     update_user = $3
                 WHERE %s',
                fk_record.child_schema,
                fk_record.child_table,
                where_clause
            );
            
            EXECUTE query_text USING OLD, deletion_context_pattern, current_user_name;
        END LOOP;
        
        -- Clear parent's deletion context
        IF EXISTS (
            SELECT 1 FROM information_schema.columns 
            WHERE table_schema = TG_TABLE_SCHEMA 
              AND table_name = TG_TABLE_NAME 
              AND column_name = 'delete_user'
        ) THEN
            NEW.delete_user := NULL;
        END IF;
        
        IF EXISTS (
            SELECT 1 FROM information_schema.columns 
            WHERE table_schema = TG_TABLE_SCHEMA 
              AND table_name = TG_TABLE_NAME 
              AND column_name = 'delete_ts'
        ) THEN
            NEW.delete_ts := NULL;
        END IF;
        
        -- Update parent's update columns
        NEW.update_ts := CURRENT_TIMESTAMP;
        NEW.update_user := current_user_name;
    END IF;
    
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Install the trigger.

-- Apply cascade triggers only to tables that have BOTH active AND delete_ts columns
DO $$
DECLARE
    table_record RECORD;
    has_active_column BOOLEAN;
    has_delete_ts_column BOOLEAN;
BEGIN
    FOR table_record IN
        SELECT 
            n.nspname AS schema_name,
            c.relname AS table_name,
            c.oid AS table_oid
        FROM pg_class c
        JOIN pg_namespace n ON c.relnamespace = n.oid
        WHERE c.relkind = 'r'  -- Regular tables only
          AND n.nspname NOT IN ('pg_catalog', 'information_schema')
          AND EXISTS (
              SELECT 1 FROM pg_constraint con
              JOIN pg_class ref ON con.confrelid = ref.oid
              WHERE con.contype = 'f'
                AND ref.oid = c.oid
          )
    LOOP
        -- Check if table has required columns
        SELECT EXISTS (
            SELECT 1 FROM pg_attribute a
            WHERE a.attrelid = table_record.table_oid
              AND a.attname = 'active'
              AND NOT a.attisdropped
        ) INTO has_active_column;
        
        SELECT EXISTS (
            SELECT 1 FROM pg_attribute a
            WHERE a.attrelid = table_record.table_oid
              AND a.attname = 'delete_ts'
              AND NOT a.attisdropped
        ) INTO has_delete_ts_column;
        
        IF NOT (has_active_column AND has_delete_ts_column) THEN
            RAISE NOTICE 'Skipping %.% - missing required columns (active: %, delete_ts: %)', 
                table_record.schema_name, table_record.table_name,
                has_active_column, has_delete_ts_column;
            CONTINUE;
        END IF;
        
        -- Drop existing trigger if it exists
        EXECUTE format(
            'DROP TRIGGER IF EXISTS trg_cascade_soft_ops ON %I.%I',
            table_record.schema_name, table_record.table_name
        );
        
        -- Create new trigger (BEFORE so that the NEW.* assignments in the function take effect)
        EXECUTE format(
            'CREATE TRIGGER trg_cascade_soft_ops
             BEFORE UPDATE OF active ON %I.%I
             FOR EACH ROW
             EXECUTE FUNCTION smart_cascade_soft_delete()',
            table_record.schema_name, table_record.table_name
        );
        
        RAISE NOTICE 'Created cascade trigger on %.%', 
            table_record.schema_name, table_record.table_name;
    END LOOP;
END $$;

The above approach has the following benefits.

  • Clean separation: delete_ts/delete_user are dedicated to soft delete tracking

  • Clear semantics: Easy to understand and query

  • No interference: Doesn’t conflict with update_ts/update_user for normal updates

  • Intelligent restoration: Can restore only cascade-deleted records

  • Audit trail: Complete history of who deleted what and when

This approach ensures you only restore child entities that were cascade-deleted, maintaining data integrity while providing a clear audit trail.

3. Special Handler for deletion of Host and Org

Due to the significant number of tables that need to be updated when deleting a host or an org, we rely on the database's cascade delete. Deletion of a host or org will therefore be implemented as a hard delete, and users should be warned about this on the UI.

4. Add delete_ts column to reverse cascade soft delete

After a cascade soft delete of role_t, all child entities are marked as active = false. When the same role is activated again, we need to set all cascade-deleted child entities back to active = true. However, we must avoid updating rows that were soft-deleted individually. By adding a delete_ts column (used together with the delete_user context set by the trigger), we can identify all related child entities that were cascade-deleted.

5. Update queries to add active = true condition

We need to update some queries in the db provider to add conditions for each joining table with active = true so that only active rows will be returned.

Conclusion:

Based on our team discussion, we are going to:

  • Adopt the third option, which uses a database trigger to mirror the behavior of a hard cascade delete.
  • Change the org and host delete to a hard delete.
  • Update some queries to add the active = true condition.

Query Active Rows

Since we use soft deletes for most tables in the read model, we need to apply an active = true filter to our queries.

For single-table queries, this is straightforward—we can simply add AND active = true to the query. However, for join queries involving multiple tables, the active = true condition must be applied consistently across all participating tables, ideally in an automatic manner.

There are two approaches we can take on top of the current database provider implementation:

Active in filters

    @Override
    public Result<String> queryRolePermission(int offset, int limit, String filtersJson, String globalFilter, String sortingJson, String hostId) {


        // 'filters' below is the List<Map<String, Object>> parsed from the filtersJson parameter
        boolean isActive = true; // Default to true (active records only)

        // Iterate safely to find and remove the 'active' filter to handle it manually
        if (filters != null) {
            Iterator<Map<String, Object>> it = filters.iterator();
            while (it.hasNext()) {
                Map<String, Object> filter = it.next();
                if ("active".equals(filter.get("id"))) {
                    Object val = filter.get("value");
                    if (val != null) {
                        isActive = Boolean.parseBoolean(val.toString());
                    }
                    it.remove(); // Remove from list so dynamicFilter doesn't add it again
                    break;
                }
            }
        }

        StringBuilder activeSql = new StringBuilder();
        if (isActive) {
            // Strict consistency: A record is only "active" if all related entities are active
            activeSql.append(" AND rp.active = true");
            activeSql.append(" AND r.active = true");
            activeSql.append(" AND ae.active = true");
            activeSql.append(" AND av.active = true");
        } else {
            // Soft-deleted view: Usually we only care that the specific record itself is inactive
            activeSql.append(" AND rp.active = false");
        }

    }	

Pros

  • No need to change the signature, UI and service layer.

Cons

  • Need to iterate all filters to find the active flag per call.

Active as a separate parameter

    @Override
    public Result<String> queryRolePermission(int offset, int limit, String filtersJson, String globalFilter, String sortingJson, boolean active, String hostId) {
        
        StringBuilder activeSql = new StringBuilder();
        if (active) {
            // Strict consistency: A record is only "active" if all related entities are active
            activeSql.append(" AND rp.active = true");
            activeSql.append(" AND r.active = true");
            activeSql.append(" AND ae.active = true");
            activeSql.append(" AND av.active = true");
        } else {
            // Soft-deleted view: Usually we only care that the specific record itself is inactive
            activeSql.append(" AND rp.active = false");
        }


    }	

Pros

  • Logic is simple in the query.

Cons

  • Need to change the service layer and UI to add an additional parameter.

Conclusion

We recommend proceeding with Option 2. While it requires an initial refactor of the Service and UI layers, it provides strict type safety and cleaner code.

Reasoning:

  • Code Reuse: Option 1 requires repeating the filter iteration logic inside every DAO method. Option 2 keeps DAO methods clean.

  • Semantics: The active status affects multiple table joins (Data Integrity), distinguishing it from standard column filters. It should be an explicit argument.

  • Maintainability: Option 2 decouples the Database layer from the UI’s JSON structure. If the UI changes how it sends the active status, we only change the extraction logic in the Controller, not every SQL query method.

Distributed Scheduler Design

Introduction

The Distributed Scheduler is a robust, highly available component of the light-portal architecture that manages the periodic execution of tasks across a cluster of application instances. It ensures that scheduled tasks are executed exactly as defined, even in a distributed environment, by using a database-backed leader election and locking mechanism.

Architecture

The scheduler follows a Leader-Follower pattern to prevent redundant executions and ensure consistency.

  1. Leader Election: All scheduler instances compete for a global lock in the scheduler_lock_t table.
  2. Lock Heartbeat: The leader periodically updates its heartbeat to maintain ownership. If the leader fails, another instance will eventually claim the lock after a timeout.
  3. Polling Loop: Only the leader performs the polling of the schedule_t table for due tasks.
  4. Task Execution: When a task is due, the scheduler generates the corresponding event into the event_store_t and outbox_message_t tables and updates the next_run_ts for the next occurrence.

Database Schema

schedule_t

Stores the definitions and state of all scheduled tasks.

CREATE TABLE schedule_t (
    schedule_id          UUID NOT NULL,
    host_id              UUID NOT NULL,
    schedule_name        VARCHAR(126) NOT NULL,
    frequency_unit       VARCHAR(16) NOT NULL, -- e.g., 'MINUTES', 'HOURS', 'DAYS'
    frequency_time       INTEGER NOT NULL,
    start_ts             TIMESTAMP WITH TIME ZONE NOT NULL,
    next_run_ts          TIMESTAMP WITH TIME ZONE NOT NULL,
    event_topic          VARCHAR(126) NOT NULL,
    event_type           VARCHAR(126) NOT NULL,
    event_data           TEXT NOT NULL,
    aggregate_version    BIGINT DEFAULT 1 NOT NULL,
    active               BOOLEAN NOT NULL DEFAULT TRUE,
    PRIMARY KEY(schedule_id)
);
CREATE INDEX idx_schedule_active_next_run ON schedule_t (active, next_run_ts);

scheduler_lock_t

Facilitates distributed locking and leader election.

CREATE TABLE scheduler_lock_t (
    lock_id              INT PRIMARY KEY, -- Static ID for the global scheduler lock
    instance_id          VARCHAR(255) NOT NULL, -- ID of the holding instance
    last_heartbeat       TIMESTAMP WITH TIME ZONE NOT NULL
);

Implementation Details

Leader Election and Heartbeat

Instances attempt to acquire the lock by updating the last_heartbeat if the existing heartbeat has expired (e.g., more than 60 seconds ago).

UPDATE scheduler_lock_t 
SET instance_id = ?, last_heartbeat = CURRENT_TIMESTAMP 
WHERE lock_id = 1 AND (instance_id = ? OR last_heartbeat < ?);
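
The acquisition attempt can be issued from plain JDBC. Below is a minimal sketch, assuming a javax.sql.DataSource and a 60-second lease; the class and method names are illustrative and not taken from the actual scheduler code.

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.time.OffsetDateTime;

// Illustrative sketch; method and parameter names are not from the actual code base.
public class SchedulerLock {
    public static boolean tryAcquireLock(DataSource ds, String instanceId, long leaseSeconds) throws SQLException {
        String sql = "UPDATE scheduler_lock_t SET instance_id = ?, last_heartbeat = CURRENT_TIMESTAMP "
                   + "WHERE lock_id = 1 AND (instance_id = ? OR last_heartbeat < ?)";
        try (Connection conn = ds.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, instanceId);
            ps.setString(2, instanceId);
            // A heartbeat older than the lease (e.g. 60 seconds) is treated as expired.
            ps.setObject(3, OffsetDateTime.now().minusSeconds(leaseSeconds));
            return ps.executeUpdate() == 1; // one updated row means this instance holds the lock
        }
    }
}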

Polling Mechanism

The leader queries for tasks where next_run_ts <= CURRENT_TIMESTAMP and active = true.

SELECT * FROM schedule_t 
WHERE active = true AND next_run_ts <= CURRENT_TIMESTAMP 
ORDER BY next_run_ts ASC 
LIMIT ?;

Next Run Timestamp Calculation

After a task is executed, the next_run_ts is incremented based on the frequency_unit and frequency_time.

  • Interval-based: Adds the specified amount of time to the next_run_ts.
  • Drift Correction: To prevent cumulative drift, the calculation is based on the original start_ts or the previous next_run_ts rather than the actual execution time.
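
A minimal sketch of the drift-corrected calculation, assuming the frequency_unit values map directly to java.time.temporal.ChronoUnit names; the class and method names are illustrative.

import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

// Illustrative sketch; the real implementation may differ.
public class NextRunCalculator {
    // Advance from the previous next_run_ts rather than the actual execution time,
    // so that late executions do not accumulate drift.
    public static OffsetDateTime nextRun(OffsetDateTime previousNextRun, String frequencyUnit, int frequencyTime) {
        ChronoUnit unit = ChronoUnit.valueOf(frequencyUnit); // e.g. MINUTES, HOURS, DAYS
        return previousNextRun.plus(frequencyTime, unit);
    }
}

For example, a schedule with frequency_time = 15 and frequency_unit = 'MINUTES' whose 10:00 run actually fires at 10:07 still gets a next_run_ts of 10:15 rather than 10:22.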

Execution Flow

  1. Leader polls for due tasks.
  2. For each task:
    • Starts a database transaction.
    • Inserts the specified event into the event store and outbox message.
    • Updates next_run_ts in schedule_t.
    • Commits the transaction.
  3. The event is then picked up and processed by the Event Consumer (Kafka or Postgres).

Conclusion

The Distributed Scheduler provides a reliable and scalable way to handle periodic activities within the light-portal, ensuring that tasks are executed predictably and exclusively by a single active leader at any given time.

PostgreSQL Pub/Sub Design

Introduction

The PostgreSQL Pub/Sub mechanism provides an alternative to Kafka for event distribution within the light-portal architecture. It is designed for smaller deployments or environments where Kafka is not available, offering a reliable, low-latency, and strictly ordered event delivery system using native PostgreSQL features.

Architecture

The system utilizes a hybrid Polling + LISTEN/NOTIFY approach to achieve both high reliability and low latency.

1. Logical Partitioning

To support horizontal scalability and ensure ordered processing for multi-tenant environments, the system uses logical partitioning based on the host_id.

  • Events are distributed across a fixed number of partitions (e.g., 8 or 16).
  • Partition index = abs(hashtext(host_id::text)) % total_partitions.
  • Each partition has its own progress tracker in consumer_offsets.
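
As a sketch of how a consumer restricts its reads to its own logical partition, the fetch below applies the same hash predicate in SQL. Column names other than c_offset and host_id are assumptions about the outbox_message_t layout.

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; reads only the rows that hash to this consumer's partition.
public class PartitionedFetch {
    public static List<String> fetchPayloads(DataSource ds, long startOffset, long endOffset,
                                             int totalPartitions, int partitionId) throws SQLException {
        String sql = "SELECT payload FROM outbox_message_t "
                   + "WHERE c_offset BETWEEN ? AND ? "
                   + "AND abs(hashtext(host_id::text)) % ? = ? "
                   + "ORDER BY c_offset";
        List<String> payloads = new ArrayList<>();
        try (Connection conn = ds.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, startOffset);
            ps.setLong(2, endOffset);
            ps.setInt(3, totalPartitions);
            ps.setInt(4, partitionId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    payloads.add(rs.getString(1)); // raw event payload for later grouping
                }
            }
        }
        return payloads;
    }
}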

2. Contiguous Offset Claiming

Within each partition, the consumer claims a batch of events using gapless logical offsets (c_offset).

3. Real-time Wake-up

To minimize latency without high-frequency polling, the system uses the PostgreSQL LISTEN/NOTIFY mechanism.

  • A database trigger on the outbox_message_t table issues a NOTIFY event_channel whenever new messages are inserted.
  • Consumers use LISTEN event_channel to subscribe to these real-time signals.
  • The consumer loop calls pgConn.getNotifications(timeout) to wait for signals. This allows the consumer thread to sleep efficiently and wake up immediately when work is available, while still falling back to a poll-based check if no notification is received within the waitPeriodMs.
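
A minimal sketch of this wait loop with the pgjdbc driver, assuming a dedicated, long-lived connection; the done flag corresponds to the shutdown hook described later and waitPeriodMs to the configuration shown below.

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.atomic.AtomicBoolean;
import org.postgresql.PGConnection;
import org.postgresql.PGNotification;

// Illustrative sketch of the hybrid LISTEN/NOTIFY plus polling loop.
public class EventWaitLoop {
    public static void run(Connection conn, int waitPeriodMs, AtomicBoolean done) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("LISTEN event_channel"); // subscribe to the trigger's channel
        }
        PGConnection pgConn = conn.unwrap(PGConnection.class);
        while (!done.get()) {
            // Blocks for up to waitPeriodMs and returns early when a NOTIFY arrives.
            PGNotification[] notifications = pgConn.getNotifications(waitPeriodMs);
            // Whether woken by a notification or by the timeout, try to claim and process a batch.
            // claimAndProcessBatch(conn);  // claiming and batching are covered in the sections below
        }
    }
}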

Database Schema

log_counter

Manages the global version/offset for the outbox.

CREATE TABLE log_counter (
    id INT PRIMARY KEY,
    next_offset BIGINT NOT NULL DEFAULT 1
);
INSERT INTO log_counter (id, next_offset) VALUES (1, 1);

consumer_offsets

Tracks the progress of each consumer partition.

CREATE TABLE consumer_offsets (
    group_id VARCHAR(255),
    topic_id INT, -- 1 for global outbox
    partition_id INT, -- Logical partition index
    next_offset BIGINT NOT NULL DEFAULT 1,
    PRIMARY KEY (group_id, topic_id, partition_id)
);

outbox_message_t (Modified)

Stores the events to be published.

ALTER TABLE outbox_message_t ADD COLUMN c_offset BIGINT UNIQUE;
CREATE INDEX idx_outbox_offset ON outbox_message_t (c_offset);

Triggers and Functions

Enables the NOTIFY mechanism.

CREATE OR REPLACE FUNCTION notify_event() RETURNS TRIGGER AS $$
BEGIN
    PERFORM pg_notify('event_channel', 'new_event');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER event_trigger
AFTER INSERT ON outbox_message_t
FOR EACH STATEMENT EXECUTE FUNCTION notify_event();

Implementation Details

Offset Reservation

When inserting events, the system locks the log_counter row to reserve a range of offsets:

UPDATE log_counter SET next_offset = next_offset + ? WHERE id = 1 RETURNING next_offset - ?;
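
A sketch of the reservation from the publisher side, assuming the outbox rows are inserted in the same transaction; the query returns the first offset of the reserved range.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative sketch: reserve a contiguous block of offsets for a batch of outbox messages.
public class OffsetReservation {
    public static long reserve(Connection conn, int count) throws SQLException {
        String sql = "UPDATE log_counter SET next_offset = next_offset + ? WHERE id = 1 RETURNING next_offset - ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, count);
            ps.setInt(2, count);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                // First offset of the reserved range; the caller assigns
                // c_offset = firstOffset, firstOffset + 1, ... to each message it inserts.
                return rs.getLong(1);
            }
        }
    }
}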

Competing Consumer Pattern

To support multiple instances within the same consumer group, logical offsets are “claimed” in batches using an atomic UPDATE ... RETURNING statement. This ensures that each event is processed by exactly one member of the group.

WITH counter_tip AS (
    SELECT (next_offset - 1) AS highest_committed_offset FROM log_counter WHERE id = 1
),
to_claim AS (
    SELECT group_id, next_offset, 
           LEAST(batch_size, GREATEST(0, (SELECT highest_committed_offset FROM counter_tip) - next_offset + 1)) AS delta
    FROM consumer_offsets 
    WHERE group_id = ? AND topic_id = 1 
    FOR UPDATE
),
upd AS (
    UPDATE consumer_offsets c SET next_offset = c.next_offset + t.delta
    FROM to_claim t 
    WHERE c.group_id = t.group_id AND c.topic_id = 1
    RETURNING t.next_offset AS start_offset, (c.next_offset - 1) AS end_offset
)
SELECT start_offset, end_offset FROM upd;

Transactional User-Based Batching

To ensure that events generated from the same user are handled atomically and in order, the consumer employs a grouping strategy within its processing cycle:

  1. Fetch Batch: Read raw payloads from outbox_message_t for the assigned partition range.
  2. Filter and Group:
    • Filter messages by the partition hash: abs(hashtext(host_id::text)) % ? = ?.
    • Group the filtered messages by host_id and user_id.
  3. Process by User:
    • For each (host_id, user_id) group, execute all events in a single database transaction.
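
A minimal sketch of the grouping step, assuming a simplified EventData record as a stand-in for the raw outbox payload; as described in the next sections, the consumer prefers the transaction_id for grouping when it is available.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; EventData and the grouping key are simplified stand-ins for the real types.
public class UserBatching {
    record EventData(String hostId, String userId, String payload) {}

    // Group the partition's messages so that each (host_id, user_id) pair is processed
    // in its own database transaction, preserving per-user ordering.
    public static Map<String, List<EventData>> groupByUser(List<EventData> batch) {
        Map<String, List<EventData>> groups = new LinkedHashMap<>();
        for (EventData e : batch) {
            groups.computeIfAbsent(e.hostId() + ":" + e.userId(), k -> new ArrayList<>()).add(e);
        }
        return groups;
    }
}

The LinkedHashMap keeps the groups in the order they were first seen, so per-user processing still follows the original offset sequence.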

Handling Large Atomic Transactions (Batch Extension)

If a business activity (e.g., “instance clone”) generates more events than the configured batchSize, these events should still be processed in a single transaction to maintain system consistency.

The consumer handles this via Atomic Batch Extension:

  1. After fetching the initial batch (e.g., 100 events), the consumer peeks at the next available event in the outbox.
  2. If the next event belongs to the same user_id as the last event in the batch, the consumer continues fetching consecutive events for that user until the transaction boundary is found.
  3. The consumer_offsets are then atomically updated to reflect the true end of the extended batch.
  4. This ensures that even if 120 events were generated, all 120 are processed in a single transaction, regardless of the batchSize limit.

This approach ensures that even if events are processed in parallel across different partitions, events belonging to the same user are always handled in the same transaction, maintaining consistency across subsystems.

Transaction ID and Dead Letter Queue

Transaction ID

To provide precise boundaries for atomic transactions, the system uses a transaction_id column in the outbox_message_t table:

ALTER TABLE outbox_message_t ADD COLUMN transaction_id UUID;

When events are persisted to the outbox, all events generated within a single business transaction are assigned the same transaction_id (a UUID generated once per batch in EventPersistenceImpl.insertEventStore()).

This eliminates ambiguity when grouping events:

  • Without transaction_id: Events are grouped by host_id:user_id, which may incorrectly group unrelated transactions from the same user.
  • With transaction_id: Events are grouped by their exact transaction boundary, ensuring atomic processing of related events only.

Dead Letter Queue (DLQ)

When event processing fails, the system implements a granular fallback mechanism to prevent the entire batch from being blocked:

Schema

CREATE TABLE IF NOT EXISTS dead_letter_queue (
  group_id VARCHAR(255),
  host_id UUID,
  user_id UUID,
  c_offset BIGINT,
  transaction_id UUID,
  payload JSONB,
  exception TEXT,
  created_dt TIMESTAMP DEFAULT NOW()
);

Processing Flow

  1. Normal Processing: The consumer attempts to process all events in a claimed batch within a single database transaction.

  2. Batch Failure Detection: If any event in the batch fails (e.g., constraint violation, business logic error), the entire transaction is rolled back.

  3. Fallback Mode: The consumer switches to processBatchWithFallback():

    • Re-claims the same offset range.
    • Groups events by transaction_id.
    • For each transaction group:
      • Creates a JDBC Savepoint.
      • Attempts to process all events in that transaction.
      • On success: Continues to the next transaction.
      • On failure:
        • Rolls back to the Savepoint.
        • Inserts all events from the failed transaction into dead_letter_queue.
        • Logs the error with the transaction_id for debugging.
  4. Commit: After processing all transactions (successful or moved to DLQ), the consumer commits the transaction, advancing the offset.
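
A sketch of the DLQ write performed after rolling back to the savepoint, using the schema above; parameter names are illustrative and the payload is assumed to be the raw JSON string from outbox_message_t.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

// Illustrative sketch: after rolling back to the savepoint, record one failed event in the DLQ table.
// The surrounding consumer transaction is still open, so the insert commits together with the offset advance.
public class DeadLetterWriter {
    public static void writeDlq(Connection conn, String groupId, UUID hostId, UUID userId,
                                long cOffset, UUID transactionId, String payloadJson, Exception cause) throws SQLException {
        String sql = "INSERT INTO dead_letter_queue "
                   + "(group_id, host_id, user_id, c_offset, transaction_id, payload, exception) "
                   + "VALUES (?, ?, ?, ?, ?, ?::jsonb, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, groupId);
            ps.setObject(2, hostId);
            ps.setObject(3, userId);
            ps.setLong(4, cOffset);
            ps.setObject(5, transactionId);
            ps.setString(6, payloadJson);
            ps.setString(7, cause.toString()); // keep the error context for later investigation
            ps.executeUpdate();
        }
    }
}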

Benefits

  • Isolation: Only the failing transaction is moved to DLQ; other transactions in the batch proceed normally.
  • Atomicity: All events belonging to a single business transaction are either processed together or moved to DLQ together.
  • No Blocking: The consumer never gets stuck on a single bad event.
  • Debuggability: The DLQ table preserves the full context (payload, exception, transaction_id) for manual investigation and replay.

Configuration

The consumer is configured via db-event-consumer.yml and runs in a Java 21 Virtual Thread. This ensures that the frequent Thread.sleep (during retries) and the blocking pgConn.getNotifications() (waiting for wake-ups) do not tie up native system threads, making the consumer extremely lightweight.

# Postgres pub/sub event processor configuration
# Consumer group id. It defaults to user-query-group. Please only change it if you
# know exactly what you are doing.
groupId: ${db-event-consumer.groupId:user-query-group}
# The batch size when polling the database for events. It is not fixed and will be
# adjusted if more than 100 events belong to the same transaction.
batchSize: ${db-event-consumer.batchSize:100}
# The total number of partitions. It should match the number of portal-query instances.
totalPartitions: ${db-event-consumer.totalPartitions:1}
# Partition id, ranging from 0 to totalPartitions - 1, assigned to each portal-query instance.
partitionId: ${db-event-consumer.partitionId:0}
# The poll interval, in milliseconds, for processing events from outbox_message_t in the Postgres database.
waitPeriodMs: ${db-event-consumer.waitPeriodMs:1000}

Clean Shutdown

To ensure resources are released cleanly when the application stops, a ShutdownHookProvider is implemented:

  • DbEventConsumerShutdownHook: Sets the done flag to stop the consumer loop and shuts down the ExecutorService. This ensures that the application doesn’t hang on exit and that the database connections are properly returned to the pool.
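
A minimal sketch of the hook, assuming light-4j's ShutdownHookProvider interface and a shared AtomicBoolean flag; the field wiring is illustrative, not the actual implementation.

import com.networknt.server.ShutdownHookProvider;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch of the shutdown hook; field names and wiring are assumptions.
public class DbEventConsumerShutdownHook implements ShutdownHookProvider {
    // Shared with the consumer loop, which checks the flag on every iteration.
    static final AtomicBoolean done = new AtomicBoolean(false);
    static ExecutorService executor; // the virtual-thread executor running the consumer loop

    @Override
    public void onShutdown() {
        done.set(true);              // stop the polling/notification loop
        if (executor != null) {
            executor.shutdown();     // let the in-flight batch finish
            try {
                executor.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}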

Conclusion

This native PostgreSQL implementation provides a robust alternative to Kafka, leveraging standard relational database features to maintain strict event ordering and delivery guarantees with minimal infrastructure overhead.

Comparison: Leader Election vs. Competing Consumer (Claiming)

The light-portal architecture employs two different distributed coordination strategies: Leader Election for the Scheduler and Competing Consumers (Offset Claiming) for the PostgreSQL Pub/Sub. Each approach is optimized for its specific use case.

Summary Table

Feature        | Leader Election (scheduler_lock_t)  | Host Partitioning (consumer_offsets)
Primary Goal   | Exclusive Control (Safety)           | Horizontal Scalability (Throughput)
Mechanism      | Centralized “lock” with heartbeat.   | Logical partitioning via host_id hash.
Parallelism    | None (Single active instance).       | High (N partitions, N consumers).
Database Load  | Very Low (Heartbeat only).           | Moderate (Per-partition updates).
Failover       | Detection delay (Timeout-based).     | Instant (One processor per partition).
Complexity     | Simple.                              | Moderate (Hashing + Batching).

1. Leader Election (Used in Distributed Scheduler)

Why it’s used for the Scheduler:

The “work” done by the scheduler is extremely lightweight: it simply checks if a task is due, inserts a one-line event into the outbox, and updates the next run time. However, the cost of double execution (starting the same job twice) is high.

  • Efficiency: Having one leader prevents multiple instances from redundant polling of the schedule_t table, which reduces database contention.
  • Safety: It provides a simple guarantee that only one controller decides what is triggered and when.
  • Scaling: Since the scheduler doesn’t do the actual “heavy lifting” (the work is done by event consumers), the leader bottleneck is rarely an issue.

2. Host-Based Partitioning (Used in Postgres Pub/Sub)

Why it’s used for Event Processing:

Event processing is the “Data Plane” of the system. By partitioning based on host_id, we emulate Kafka’s partitioning behavior within PostgreSQL.

  • Ordered Processing: Ensures all events for a specific host (or user) are processed by the same partition sequence, avoiding race conditions on multi-tenant data.
  • Throughput: Multiple consumers can process different partitions in parallel. 8 partitions = 8 instances working concurrently.
  • Implicit Load Balancing: Distributes thousands of hosts across a fixed number of partitions.
  • Resiliency: Each partition’s progress is independent. A failure in one host/partition doesn’t block others.

Conclusion: Which is “Better”?

Neither is universally better; they are complementary:

  • Leader Election is better for orchestration and control: Where you need a single “brain” to make consistent decisions and volume is manageable.
  • Competing Consumers is better for workload distribution: Where you need to process a high volume of independent tasks as quickly as possible.

In light-portal, we use the Scheduler (Leader) to reliably “kick off” tasks by emitting events, and the Pub/Sub (Competing Consumers) to process those events at scale.

Kafka Event Processor

Overview

The Kafka Event Processor (PortalEventConsumerStartupHook) consumes events from Kafka topics that are populated by Debezium CDC from the outbox_message_t table. It provides robust event processing with transaction-level granularity and Dead Letter Queue (DLQ) support.

Architecture

The processor uses a two-phase processing strategy with automatic fallback to ensure both performance and reliability:

  1. Optimistic Batch Processing: Attempts to process all transactions in a single database transaction for maximum throughput
  2. Granular Fallback: On failure, switches to individual transaction processing with JDBC Savepoints to isolate failures

Transaction ID Header

Events published to Kafka include a transaction_id header added by Debezium’s HeaderFrom transform. This UUID groups all events that were generated within a single business transaction, enabling:

  • Precise transaction boundaries: Events are grouped by their actual transaction, not just by user/host
  • Atomic DLQ handling: Failed transactions are moved to DLQ as a complete unit
  • Backward compatibility: Falls back to Kafka key-based grouping for events without the header

Debezium Configuration

The transaction_id header is added via the Debezium connector configuration:

{
  "transforms": "unwrap,addTransactionIdHeader,timestamp_converter,...",
  
  "transforms.addTransactionIdHeader.type": "org.apache.kafka.connect.transforms.HeaderFrom$Value",
  "transforms.addTransactionIdHeader.fields": "transaction_id",
  "transforms.addTransactionIdHeader.headers": "transaction_id",
  "transforms.addTransactionIdHeader.operation": "copy"
}

Processing Flow

Phase 1: Optimistic Batch Processing

// 1. Group events by transaction_id from headers
Map<String, List<ConsumerRecord>> transactionBatches = groupByTransactionId(records);

// 2. Process all transactions in one DB transaction
Connection conn = ds.getConnection();
conn.setAutoCommit(false);

for (Map.Entry<String, List<ConsumerRecord>> entry : transactionBatches.entrySet()) {
    for (ConsumerRecord record : entry.getValue()) {
        updateDatabaseWithEvent(conn, record.getValue());
    }
}

conn.commit();
commitOffset(records);

Benefits:

  • High throughput with single database transaction
  • Minimal overhead for the common success case

Phase 2: Fallback with Savepoints

If the batch processing fails, the processor switches to granular mode:

Connection conn = ds.getConnection();
conn.setAutoCommit(false);

for (Map.Entry<String, List<ConsumerRecord>> entry : transactionBatches.entrySet()) {
    String transactionId = entry.getKey();
    List<ConsumerRecord> txRecords = entry.getValue();
    
    Savepoint sp = conn.setSavepoint("TX_" + transactionId.hashCode());
    try {
        for (ConsumerRecord record : txRecords) {
            updateDatabaseWithEvent(conn, record.getValue());
        }
        // Success - continue to next transaction
        
    } catch (Exception e) {
        // Rollback only this transaction
        conn.rollback(sp);
        
        // Send to DLQ
        produceDLQ(txRecords, e);
    }
}

// Commit all successful transactions
conn.commit();
commitOffset(allRecords);

Benefits:

  • Isolation: Only failing transactions are moved to DLQ
  • Atomicity: All events in a transaction are processed together or fail together
  • No Blocking: Consumer continues processing subsequent transactions
  • Progress Guarantee: Offsets are committed for all records (successful + DLQ’d)

Dead Letter Queue (DLQ)

DLQ Topic

Failed transactions are sent to a DLQ topic: {original-topic}-dlq

Each DLQ message includes:

  • Key: Original Kafka key (user_id)
  • Value: Original event payload
  • TraceabilityId: Exception stack trace for debugging

DLQ Producer Configuration

The DLQ producer is configured via DeadLetterProducerStartupHook and must be enabled in the consumer config:

# kafka-consumer.yml
deadLetterEnabled: true
deadLetterTopicExt: -dlq

Monitoring and Recovery

  1. Alerting: Set up monitoring on the DLQ topic for new messages
  2. Investigation: Inspect DLQ messages to identify root cause (bad data, code bug, constraint violation)
  3. Fix: Deploy code fix or correct data inconsistency
  4. Replay: Use a re-driver application to republish events from DLQ back to the original topic

Transaction Grouping Logic

The processor extracts transaction_id from Kafka record headers:

private String extractTransactionId(ConsumerRecord<Object, Object> record) {
    Map<String, String> headers = record.getHeaders();
    if (headers != null) {
        return headers.get("transaction_id");
    }
    return null;
}

Fallback for Legacy Events: If no transaction_id header is present (old events before the header was added), the processor falls back to using the Kafka key for grouping:

String transactionId = extractTransactionId(record);
if (transactionId == null) {
    transactionId = (String) record.getKey(); // Backward compatibility
}

Error Handling Strategy

Permanent vs Transient Errors

The processor treats all exceptions during fallback processing as permanent errors that warrant DLQ routing. This includes:

  • Database constraint violations (unique, foreign key, not null)
  • Deserialization errors (malformed JSON, schema mismatch)
  • Business logic errors (validation failures, state inconsistencies)

Rationale: If an event fails during fallback (after the initial batch attempt failed), it’s unlikely to succeed on retry without intervention.

Health Monitoring

The processor sets healthy = false on critical failures, which triggers Kubernetes health probes to restart the pod:

  • Consumer instance not found
  • Framework exceptions during polling
  • Fatal errors in fallback processing (after DLQ attempt)

Configuration

Consumer configuration in kafka-consumer.yml:

# Kafka consumer properties
topic: portal-event
groupId: user-query-group
keyFormat: string
valueFormat: string

# DLQ configuration
deadLetterEnabled: true
deadLetterTopicExt: -dlq

# Polling configuration
waitPeriod: 1000  # ms to wait between polls when no records

Comparison with DB Event Consumer

Feature            | Kafka Consumer                     | DB Consumer
Event Source       | Kafka topic (via Debezium CDC)     | Direct PostgreSQL polling
Transaction ID     | From Kafka headers                 | From outbox_message_t.transaction_id column
Grouping           | Map<String, List<ConsumerRecord>>  | Map<String, List<EventData>>
DLQ Target         | Kafka DLQ topic                    | PostgreSQL dead_letter_queue table
Offset Management  | Kafka consumer offsets             | PostgreSQL consumer_offsets table
Fallback Mechanism | JDBC Savepoints                    | JDBC Savepoints

Both implementations share the same core DLQ philosophy: isolate failures at the transaction level to prevent blocking the entire consumer.

Best Practices

  1. Idempotent Processing: Ensure updateDatabaseWithEvent() logic is idempotent to handle potential reprocessing
  2. Monitor DLQ: Set up alerts for DLQ topic activity
  3. Version Events: Use schema versioning to handle event evolution gracefully
  4. Test Failure Scenarios: Regularly test DLQ routing with intentional failures
  5. DLQ Retention: Configure appropriate retention for DLQ topics to allow investigation and replay

Configuration Snapshot Design

This document describes the design and implementation of the configuration snapshot feature in the light-portal.

Overview

A configuration snapshot captures the state of an instance’s configuration at a specific point in time. It includes all properties, files, and relationships defined for that instance, merging overrides from various levels (Product, Environment, Product Version) into a “burned-in” effective configuration.

Snapshots are created in two scenarios:

  1. Deployment Trigger: Automatically created when a deployment occurs (to capture the state being deployed).
  2. User Trigger: Manually created by a user via the UI (e.g., to save a milestone).

Data Model

Snapshot Header (config_snapshot_t)

Captures metadata about the snapshot.

  • snapshot_id: UUID
  • snapshot_type: Type of snapshot (e.g., DEPLOYMENT, USER_SAVE)
  • instance_id: Target instance
  • host_id: Tenant identifier
  • deployment_id: Link to deployment (if applicable)
  • product_version: Locked product version at time of snapshot
  • service_id: Locked service ID

Snapshot Content

Snapshot data is normalized into shadow tables that mirror the runtime configuration tables. These tables differ from the runtime tables by including a snapshot_id and lacking some runtime-specific fields.

Key tables include:

  • snapshot_instance_property_t
  • snapshot_instance_file_t
  • snapshot_deployment_instance_property_t
  • snapshot_product_version_property_t
  • snapshot_environment_property_t
  • … (others for APIs, Apps, etc.)

Effective Configuration (config_snapshot_property_t)

A flattened, merged view of all properties for the snapshot. This table represents the “final” configuration values used by the instance.

  • Calculated by merging properties from all levels (Deployment > Instance > Product Version > Environment > Product) based on priority.

Backend Implementation

Stored Procedure (create_snapshot)

Located in portal-db/postgres/sp_tr_fn.sql. This procedure performs the heavy lifting:

  1. Validates the instance and retrieves scope data (product, environment, etc.).
  2. Creates the snapshot header record.
  3. Copies raw data from active runtime tables to snapshot tables (e.g., instance_property_t -> snapshot_instance_property_t).
  4. Merges properties from all levels into config_snapshot_property_t.
    • Handles list/map merging (aggregation).
    • Handles scalar overriding (last update wins/priority tiers).

Persistence Layer (ConfigPersistenceImpl.java)

Provides the Java interface that calls the stored procedure:

  • createConfigSnapshot: Calls CALL create_snapshot(...).
  • getConfigSnapshot: Retrieves snapshot headers with filtering/sorting.
  • updateConfigSnapshot: Updates metadata (description).
  • deleteConfigSnapshot: Deletes a snapshot and its cascaded data (if cascade delete is set up in DB, otherwise manual cleanup might be needed).
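
A sketch of how createConfigSnapshot might invoke the procedure through JDBC; the parameter list shown here is an assumption, since the actual signature of create_snapshot is defined in sp_tr_fn.sql.

import javax.sql.DataSource;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.UUID;

// Illustrative sketch only; the real parameter list of create_snapshot may differ.
public class SnapshotCaller {
    public static void createConfigSnapshot(DataSource ds, UUID hostId, UUID instanceId,
                                            String snapshotType, UUID deploymentId) throws SQLException {
        try (Connection conn = ds.getConnection();
             CallableStatement cs = conn.prepareCall("CALL create_snapshot(?, ?, ?, ?)")) {
            cs.setObject(1, hostId);
            cs.setObject(2, instanceId);
            cs.setString(3, snapshotType);   // e.g. DEPLOYMENT or USER_SAVE
            cs.setObject(4, deploymentId);   // deployment link for deployment-triggered snapshots
            cs.execute();
        }
    }
}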

Front End Implementation

Config Snapshot Page (ConfigSnapshot.tsx)

  • Displays a list of snapshots for a selected instance.
  • Supports filtering by current, ID, date, etc.
  • Actions:
    • Create: Navigates to /app/form/createConfigSnapshot.
    • Update: Fetches fresh data and navigates to update form.
    • Delete: Calls deleteSnapshot command.

Gap Analysis & Missing Components

The following components are currently MISSING or incomplete:

  1. Command Handlers:

    • CreateConfigSnapshot handler (for User Trigger) is missing in config-command.
    • DeleteConfigSnapshot handler is missing in config-command.
    • GetFreshConfigSnapshot handler is missing (required for the “Update” action in UI).
  2. Deployment Integration:

    • CreateDeployment.java (in deployment-command) does NOT call createConfigSnapshot.
    • The automatic snapshot creation on deployment is currently not implemented.
  3. API Definition:

    • The createConfigSnapshot and deleteConfigSnapshot endpoints need to be defined in the schema/routing if they are not already.

Action Plan

  1. Implement Command Handlers:

    • Create CreateConfigSnapshot handler in config-command that invokes ConfigPersistence.createConfigSnapshot.
    • Create DeleteConfigSnapshot handler in config-command.
    • Create GetFreshConfigSnapshot handler in config-query.
  2. Integrate with Deployment:

    • Modify CreateDeployment.java (or the platform handler it invokes) to call ConfigPersistence.createConfigSnapshot immediately after a successful deployment job is submitted or completed.
  3. Review Idempotency:

    • Ensure create_snapshot handles re-runs gracefully (Idempotency is partially handled by UUID generation, but business logic should prevent duplicate snapshots for the exact same state if needed).

Config Clone

OAuth 2.0 State Parameter Design

This document outlines the design, generation, and flow of the state parameter within the LightAPI OAuth 2.0 architecture.

Overview

The state parameter is an opaque value used by the client to maintain state between the request and callback. In the OAuth 2.0 Authorization Code Flow, its primary and critical function is to prevent Cross-Site Request Forgery (CSRF) attacks.

Workflow

The flow involves three parties:

  1. Client: The application requesting access (e.g., Light Portal).
  2. Authorization Server UI: The front-end login interface (e.g., Login View).
  3. Authorization Service: The backend service validating credentials and issuing codes.

Step-by-Step Flow

  1. Generation (Client Side)

    • The User initiates a login action on the Client.
    • The Client generates a cryptographically strong random string (the state).
    • The Client stores this state locally (e.g., in a secure, HTTP-only cookie or Session Storage) bound to the user’s current session.
    • The Client redirects the browser to the Authorization Server UI (login-view), appending the state as a query parameter.
    GET https://login.lightapi.net/?client_id=...&response_type=code&state=xyz123...
    
  2. Preservation (Authorization Server UI)

    • The Authorization Server UI (login-view) loads and parses the query parameters.
    • It must not modify or validate the state. Its sole responsibility is preservation.
    • When the user submits credentials (username/password) or selects a social provider, the UI passes the state exactly as received to the backend Authorization Service.
  3. Authorization (Authorization Service)

    • The backend service authenticates the user.
    • Upon success, it generates an Authorization Code.
    • It constructs the redirect URL back to the Client.
    • It must append the exact same state value received from the UI to this redirect URL.
    HTTP/1.1 302 Found
    Location: https://portal.lightapi.net/authorization?code=auth_code_abc&state=xyz123...
    
  4. Verification (Client Side)

    • The Client receives the callback request.
    • It extracts the state from the URL parameters.
    • It retrieves the stored state from its local session.
    • It compares the two values:
      • Match: The request is valid. Proceed to exchange the code for a token.
      • Mismatch: The request is potentially malicious (CSRF likely). Reject the request and show an error.

Security Requirements

  • Uniqueness: The state must be unique per authentication request.
  • Entropy: It must be a cryptographically random string (high entropy) to be unguessable.
  • Binding: It must be bound to the user’s specific browser session on the client side.
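
The entropy requirement is language-agnostic; as an illustration, a server-side client written in Java could generate the state as follows (the class name is arbitrary).

import java.security.SecureRandom;
import java.util.Base64;

// Illustrative sketch of a high-entropy, URL-safe state value.
public class StateGenerator {
    private static final SecureRandom RANDOM = new SecureRandom();

    public static String newState() {
        byte[] bytes = new byte[32];                  // 256 bits of entropy
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}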

Responsibility Matrix

Component        | Responsibility | Action
Portal (Client)  | Owner          | Generate, Store, Verify.
Login View (UI)  | Carrier        | Receive, Preserve, Forward.
Auth Service     | Echo           | Receive, Echo back in Redirect.

References

Event Promotion Design: State-Based Reconciliation with Composite Keys

Overview

Traditional event sourcing replication involves copying raw events from one environment to another. However, this fails when the target environment has diverged (e.g., hotfixes), causing aggregateVersion conflicts. Additionally, strict global UUID constraints can prevent reusing the same ID across environments (Tenants). Finally, partial promotions can fail if parent dependencies (referential integrity) are missing in the target.

To resolve this, we adopt a State-Based Reconciliation approach (Semantic Replay) combined with Composite Keys for identity and Recursive Dependency Resolution for integrity.

Core Strategy: State-Based Reconciliation

Workflow

  1. Export (Lower Environment):
    • Query the current state (Snapshot) of the entity from the Lower Environment (LE).
    • Produce a “Canonical State Snapshot” (JSON).
  2. Import & Diff (Higher Environment):
    • Read the LE Snapshot.
    • Query the current state of the representative entity in the Higher Environment (HE).
    • Compare:
      • New? -> Generate XxxCreatedEvent.
      • Changed? -> Calculate Delta -> Generate XxxUpdatedEvent.
      • Same? -> No-op.

Advantages

  • Conflict Immunity: No aggregateVersion conflicts; we always append new events.
  • Self-Healing: Automatically synchronizes diverged states.

Identity Strategy: Composite Keys

The Problem: Global UUID Uniqueness

In a multi-tenant system sharing a single database, a standard Primary Key UUID (e.g., user_id) is globally unique. This prevents us from having “User Steve” with UUID 123 in both the “Dev Tenant” and “Prod Tenant” if the DB enforces strict uniqueness on that column.

The Solution: Composite Keys (host_id + aggregate_id)

We scope all identity by the Tenant ID (host_id).

  1. Schema Change:

    • Primary Keys: Change from PK(id) to PK(host_id, id).
    • Uniqueness: Change unique constraints (e.g., email) from UK(email) to UK(host_id, email).
    • Event Store: Change unique constraint from UK(aggregate_id, version) to UK(host_id, aggregate_id, version).
  2. Promotion Benefit:

    • Dev Tenant: host_id=DEV, user_id=123
    • Prod Tenant: host_id=PROD, user_id=123
    • Matching entities is trivial (compare id directly).

Data Integrity: Recursive Dependency Resolution

The Problem: Missing Dependencies

Promoting a child entity (e.g., API Configuration) fails if its parent (e.g., API Instance) does not exist in the target environment (Higher Env).

The Solution: Deep Promotion (Recursive Bundling)

The exporter must be “Topology Aware”.

  1. Dependency Metadata: Every Entity Type must declare its dependencies.

    • ApiConfig depends on ApiInstance.
    • ApiInstance depends on GatewayInstance.
    • GatewayInstance depends on Host.
  2. Export Workflow (Recursive): When a user selects ApiConfig-123 for promotion:

    • System checks ApiConfig-123 -> Parent ApiInstance-456.
    • System checks ApiInstance-456 -> Parent GatewayInstance-789.
    • Export Package: Includes [GatewayInstance-789, ApiInstance-456, ApiConfig-123] (Ordered by dependency).
  3. Import Workflow (Ordered): The Importer processes the list in order:

    1. GatewayInstance: Exists in Prod? Yes. (Skip).
    2. ApiInstance: Exists in Prod? No. Action: Create ApiInstance.
    3. ApiConfig: Exists in Prod? No. Action: Create ApiConfig.

Dry Run Technical Implementation

Purpose

To guarantee the promotion will succeed without actually modifying the Higher Environment (Production).

Two validation options are considered.

Option 1: App Simulation (In-Memory Diff)

  • Logic: The Importer queries the DB (read-only) to fetch the current state of all entities in the package.
  • Result: It calculates the “Diff Plan” purely in memory.
  • Output: “Plan: Create API Instance (New), Update API Config (Diff)”.
  • Pros: Very fast, zero DB locks.
  • Cons: Does not verify deep database constraints (e.g., complex triggers or check constraints) that only trigger on write.

Option 2: Transaction Rollback (Write Simulation)

  • Logic:
    1. Start a Database Transaction: connection.setAutoCommit(false);
    2. Simulate Execution: Perform the actual SQL Inserts and Updates generated by the Plan.
      • Insert ApiInstance
      • Insert ApiConfig
    3. Check for Errors: If any SQL Exception occurs (e.g., FK violation, unique constraint violation), catch it.
    4. Rollback: Regardless of success or failure, always call connection.rollback().
  • Output: “Validation Successful: The detailed plan is valid and safe to execute.” OR “Validation Failed: FK Violation on Table X”.
  • Pros: 100% certainty that the data is valid according to the database schema.
  • Cons: Slightly heavier and takes locks, but acceptable for admin operations.
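
A minimal sketch of the rollback-based validation (Option 2), assuming the diff step produces the plan as a list of SQL statements; class and method names are illustrative.

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import javax.sql.DataSource;

// Illustrative sketch: execute the generated plan inside a transaction and always roll back.
public class DryRunValidator {
    public static String validate(DataSource ds, List<String> plannedSql) throws SQLException {
        try (Connection conn = ds.getConnection()) {
            conn.setAutoCommit(false);
            try (Statement stmt = conn.createStatement()) {
                for (String sql : plannedSql) {
                    stmt.execute(sql);           // simulate the inserts/updates of the plan
                }
                return "Validation Successful: the plan is valid and safe to execute.";
            } catch (SQLException e) {
                return "Validation Failed: " + e.getMessage(); // e.g. FK or unique constraint violation
            } finally {
                conn.rollback();                 // never persist the simulated writes
            }
        }
    }
}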

Recommendation

Use Option 1 (App Simulation) for the UI preview to show the user “what will happen”. Use Option 2 (Transaction Rollback) immediately when the user clicks “Promote” (as a pre-flight check) or as an explicit “Verify” button to ensure deep integrity.

Sibling Deletion: Handling Orphaned Items

The User Case

When promoting a collection of items (e.g., “10 Config Properties” in HE vs “8 in LE”), simply creating or updating the 8 matching items from LE is insufficient. We must identify the 2 extra items in HE that likely need to be deleted to match the LE state.

Design Pattern: Scoped Reconciliation

To handle this, the import logic must be aware of the “Parent Scope” of the entities being promoted.

  1. Export (Snapshot with Siblings):

    • When promoting ApiConfig-123, we fetch ALL associated properties for that config in LE.
    • LE Snapshot: Properties = {P1, P2, ... P8} (Total 8).
  2. Import (Set Difference Logic):

    • Query ALL associated properties for ApiConfig-123 in HE.
    • HE State: Properties = {P1, P2, ... P8, P9, P10} (Total 10).
    • Logic: HE_Only = HE_Set - LE_Set => {P9, P10}.
  3. User Decision (Interactive Mode):

    • The Dry Run Plan reports:
      • Updates: 8 items synced (P1..P8).
      • Deletions (Potential): 2 items exist in Prod but not Dev (P9, P10).
    • Default Action: Do nothing (Safe Mode).
    • Option: “Sync Deletes” -> Checkbox to delete extras?
    • Strict Mode: Mirror exact state (Automatically schedule ConfigPropertyDeletedEvent for P9, P10).

Implementation Checklist

  • Exporter must include the full list of children IDs when exporting a parent container.
  • Importer must realize that for “One-to-Many” relationships, it has to fetch the full target set to detect orphans.

Implementation

Sign In

Portal Dashboard

The Portal Dashboard is served by the portal-view single-page application.

  • Guest User Access:
    Upon landing on the dashboard, a guest user can:

    • View certain menus.
    • Perform limited actions within the application.
  • Accessing Privileged Features:
    To access additional features:

    1. Click the User button.
    2. Select the Sign In menu item.

Login View

  • Redirection to Login View:
    When the Sign In menu item is clicked, the browser is redirected to the Login View single-page application. This application is served by the same instance of light-gateway and handles user authentication against the OAuth 2.0 server (OAuth Kafka) to initiate the Authorization Code grant flow.

  • OAuth 2.0 Client ID:
    The client_id is included in the redirect URL as a query parameter. This ensures that the client_id is sent to the OAuth 2.0 server to obtain the authorization code. In this context, the client_id is associated with the portal-view application.

  • Login View Responsibilities:
    The Login View is a shared single-page application used by all other SPAs across various hosts. It is responsible for:

    • Authenticating users.
    • Ensuring that user credentials are not passed to any other single-page applications or business APIs.
  • SaaS Deployment in the Cloud:
    In a SaaS environment, all users are authenticated by the OAuth 2.0 server using the light-portal user database. As a result, the user type does not need to be passed from the Login View.

  • On-Premise Deployment:
    For on-premise deployments, a customized Login View should include a radio button for selecting the user type. Typical options for most organizations are:

    • Employee (E)
    • Customer (C)
  • Customized Authentication:
    Based on the selected user type:

    • Employees are authenticated via Active Directory.
    • Customers are authenticated using the customer database.

    A customized authenticator implementation should handle this logic, ensuring the correct authentication method is invoked for each user type.

Login Form Submission

  • Form Submission Endpoint:
    /oauth2/N2CMw0HGQXeLvC1wBfln2A/code

  • Request Details:

    • Headers:
      • Content-Type: application/x-www-form-urlencoded
    • Method:
      • POST
    • Body Parameters:
      • j_username: The user’s username.
      • j_password: The user’s password.
      • remember: Indicates whether the session should persist.
      • client_id: The OAuth 2.0 client identifier.
      • state: A hardcoded value (requires additional work for dynamic handling).
      • user_type: (Optional) Specifies the type of user (e.g., employee or customer).
      • redirect_uri: (Optional) The URI to redirect after authentication.

Light Gateway

The light-gateway instance acts as a BFF and has a routing rule that routes any request with the /oauth2 prefix to the oauth-kafka server.

OAuth Kafka

  • LightPortalAuthenticator

    A request to hybrid-query:

    {"host":"lightapi.net","service":"user","action":"loginUser","version":"0.1.0","data":{"email":"%s","password":"%s"}}
    

User Query

  • LoginUser

This handler calls the loginUserByEmail method of PortalDbProviderImpl.

PortalDbProviderImpl

The input for this method is the user’s email. Upon successful execution, the method returns a JSON string containing all user properties retrieved from the login query.

LightPortalAuthenticator

The authenticator will utilize the user data returned from the above query to validate the password. Upon successful password verification, it will return an Account object with the following attributes:

  • Principal: The user’s identifier, which is the email.
  • Roles: A collection containing a single element—the user’s JSON

After the Account object is created and returned, control is passed to the HostIdCodePostHandler.

HostIdCodePostHandler

It gets the client_id from the submitted form and calls dbProvider.queryClientByClientId to get the client information. Upon success, it gets the Account object created by the authenticator above from the security context.

It then creates a UUID authorization code and a map associated with the code. The map contains the properties needed to create the authorization code token: some properties from the client and the entire user JSON.

Finally, it calls ClientUtil.createAuthCode with the codeMap to create the authorization code and then redirects the browser back to the redirect URI with the code.

ClientUtil.createAuthCode

The ClientUtil gets a client credentials token and calls the CreateAuthCode handler in the hybrid-command to publish the code to the Kafka cluster in order to notify the other parties about this code. The codeMap is passed to the handler as data.

CreateAuthCode Handler

The handler creates a MarketCodeCreatedEvent and passes the entire input map to the event as the value field.

MarketQueryStreams

It processes the MarketCodeCreatedEvent and calls dbProvider.createMarketCode with the event.

createMarketCode

This method in dbProvider puts the event value into the cacheManager cache named “auth_code”. The code is now ready to be queried from the market-query service.

Portal View

The HostIdCodePostHandler redirects the code to the Portal View with /authorization?code=??? and this request will be sent to the light-gateway StatelessAuthHandler.

StatelessAuthHandler

If the request path matches the configured authPath, it retrieves the code from the query parameter. It then creates a CSRF UUID token and an AuthorizationCodeRequest to get a token via OauthHelper. This request contains the auth code, the CSRF token and other properties from the configuration. The request is sent to the HostIdTokenPostHandler to create the authorization code token.

HostIdTokenPostHandler

It calls dbProvider.queryClientByClientId and then verifies that the clientId and clientSecret match.

It invokes ClientUtil.getAuthCodeDetail to retrieve the code details from the market-query service and then calls ClientUtil.deleteAuthCode to remove the auth code, as it is a one-time code.

Login View

The login-view is a Single Page Application (SPA) built with React and Vite. It serves as the user interface for the OAuth 2.0 Authorization Code flow within the LightAPI ecosystem.

Overview

This application acts as the front-end for the Authorization Server. When a user attempts to access a protected resource on a client application (the “Portal”), they are redirected to this application to authenticate and grant consent.

It handles:

  • User Authentication (Username/Password).
  • Social Login (Google, Facebook, GitHub).
  • OAuth 2.0 Consent Granting.
  • Password Management (Forgot Password, Reset Password).

Technology Stack

  • Framework: React 18
  • Build Tool: Vite
  • UI Library: Material UI (MUI) v6
  • Routing: React Router DOM v6
  • Social Login:
    • Google: @react-oauth/google
    • Facebook: @greatsumini/react-facebook-login
    • GitHub: Manual OAuth 2.0 flow with react-social-login-buttons

Key Flows

1. OAuth 2.0 Authorization

The application expects to be opened with standard OAuth 2.0 query parameters:

  • client_id: The ID of the client application requesting access.
  • response_type: Typically code.
  • redirect_uri: Where to redirect after success.
  • state: A random string generated by the client to prevent CSRF.
  • scope: Requested permissions.

Process:

  1. The Login component extracts these parameters from the URL.
  2. User submits credentials or uses social login.
  3. On success, the application receives an authorization code from the backend.
  4. To grant consent (if configured), the user is shown the Consent screen.
  5. Finally, the browser is redirected to the redirect_uri with the code and state.

2. Social Login Configuration

The application supports multiple identity providers.

  • Google: Uses the modern Google Identity Services. Configured in src/main.jsx via GoogleOAuthProvider.
  • Facebook: Uses the Facebook SDK wrapper. Configured in src/components/FbLogin.jsx.
  • GitHub: Uses a manual popup flow. The client ID is configured in src/components/GithubLogin.jsx. The redirect URI /github/callback handles the code extraction.

3. Backend Integration

The application proxies API requests to the backend (Light Gateway/OAuth Provider) using vite.config.js proxy settings during development.

  • /oauth2/*: For token and code endpoints.
  • /portal/*: For user management commands (login query).
  • /google, /facebook, /github: Endpoints to exchange social tokens/codes for LightAPI authorization codes.

Development

Setup

yarn install

Run Locally

yarn dev

Runs on https://localhost:5173 by default.

Build

yarn build

Generates production assets in the dist folder.

Project Structure

  • src/components/: Reusable UI components (Login forms, Social buttons).
  • src/theme.js: MUI theme configuration.
  • src/main.jsx: Application entry point and providers.
  • vite.config.js: Vite configuration including proxy rules.

Portal Services

This section provides an overview of the services utilized by Light Portal. Each service is implemented as a separate repository and is initialized during the hybrid-query or hybrid-command startup process. These services are designed to handle specific functionalities within the portal and may interact with one another to execute complex operations.

Light Portal adopts the Command Query Responsibility Segregation (CQRS) pattern, categorizing services into two types: Query and Command. Query services manage read operations, while Command services handle write operations, ensuring a clear separation of responsibilities.

Attribute Service

Attribute Query Service

Handles queries related to attributes.

Services Used

Attribute Command Service

Handles commands related to attributes.

Services Used

  • user-query

Client Service

Client Query Service

Handles queries related to clients.

Services Used

Client Command Service

Handles commands related to clients.

Services Used

  • user-query

Config Service

Config Query Service

Handles queries related to configurations.

Services Used

Config Command Service

Handles commands related to configurations.

Services Used

  • user-query
  • config-query

Deployment Service

Deployment Query Service

Handles queries related to deployments.

Services Used

Deployment Command Service

Handles commands related to deployments.

Services Used

  • user-query

Group Service

Group Query Service

Handles queries related to groups.

Services Used

Group Command Service

Handles commands related to groups.

Services Used

  • user-query

Host Service

Host Query Service

Handles queries related to hosts.

Services Used

Host Command Service

Handles commands related to hosts.

Services Used

  • user-query

Instance Service

Instance Query Service

Handles queries related to instances.

Services Used

Instance Command Service

Handles commands related to instances.

Services Used

  • user-query

OAuth Service

OAuth Query Service

Handles queries related to OAuth.

Services Used

OAuth Command Service

Handles commands related to OAuth.

Services Used

  • user-query
  • oauth-query

Position Service

Position Query Service

Handles queries related to positions.

Services Used

Position Command Service

Handles commands related to positions.

Services Used

  • user-query

Product Service

Product Query Service

Handles queries related to products.

Services Used

Product Command Service

Handles commands related to products.

Services Used

  • user-query

Role Service

Role Query Service

Handles queries related to roles.

Services Used

Role Command Service

Handles commands related to roles.

Services Used

  • user-query

Rule Service

Rule Query Service

Handles queries related to rules.

Services Used

  • service-query

Rule Command Service

Handles commands related to rules.

Services Used

  • user-query
  • host-query

Service Service

Service Query Service

Handles queries related to services.

Services Used

Service Command Service

Handles commands related to services.

Services Used

  • user-query

User Service

User Query Service

Handles queries related to users.

Services Used

User Command Service

Handles commands related to users.

Services Used

  • user-query
  • service-query

Portal View

OAuth 2.0 State Verification

This document describes the implementation of CSRF protection for the OAuth 2.0 authorization code flow in the portal-view application.

Overview

To prevent Cross-Site Request Forgery (CSRF) attacks during the OAuth 2.0 authentication process, we implement a state parameter check. A random state string is generated before the authentication request and verified upon the callback.

Implementation Details

State Generation

Location: src/components/Header/ProfileMenu.tsx

When the user initiates the sign-in process:

  1. A random alphanumeric string is generated.
  2. This string is stored in the browser’s localStorage under the key portal_auth_state.
  3. The string is appended as the state query parameter to the OAuth 2.0 authorization URL.
// Generate a random state for CSRF protection
const state = Math.random().toString(36).substring(7);
localStorage.setItem('portal_auth_state', state);

const defaultUrl =
  `https://locsignin.lightapi.net?client_id=...&state=${state}`;

Redirect Handling

Location: src/App.tsx

To ensure the state query parameter is preserved during the redirect from the root path (/) to the dashboard, a custom RedirectWithQuery component is used. This component handles both standard query parameters and hash-based redirects (common with certain OAuth providers or router configurations).

  1. Checks window.location.hash for paths (e.g., /#/app/dashboard?state=...).
  2. Prioritizes the hash path if present to ensure react-router receives the correct target.
  3. Appends existing query parameters from useLocation().search.
  4. Uses useNavigate for the redirection.
const RedirectWithQuery = ({ to }: { to: string }) => {
  // ... logic to preserve search params and handle hash paths
  if (window.location.pathname === to) return; // Prevent loop
  // ...
  navigate(target, { replace: true });
};

State Verification

Location: src/pages/dashboard/Dashboard.tsx

Upon successful authentication, the provider redirects the user back to the application (defaulting to the Dashboard).

  1. The application retrieves the state parameter from the URL query string.
  2. It retrieves the stored state from localStorage (portal_auth_state).
  3. The two values are compared:
    • Match: The verification succeeds, and the portal_auth_state is removed from localStorage.
    • Mismatch: The verification fails. The user is alerted and immediately logged out via signOut to protect the session.
useEffect(() => {
  const searchParams = new URLSearchParams(location.search);
  const state = searchParams.get('state');

  // Check if we have a state and haven't attempted verification yet in this mount
  if (state && !verificationAttempted.current) {
    verificationAttempted.current = true;
    const storedState = localStorage.getItem('portal_auth_state');
    if (storedState === state) {
      console.log('OAuth state verified successfully.');
      localStorage.removeItem('portal_auth_state');
      // Remove state from URL to prevent re-verification
      const newSearchParams = new URLSearchParams(location.search);
      newSearchParams.delete('state');
      navigate({ search: newSearchParams.toString() }, { replace: true });
    } else {
      console.error('OAuth state mismatch. Potential CSRF attack.');
      alert('OAuth state mismatch. Potential CSRF attack. Logging out...');
      signOut(userDispatch, navigate);
    }
  }
}, [location, navigate, userDispatch]);

Testing State Mismatch (Manual Steps)

To manually verify the security logout mechanism:

  1. Ensure you are logged in to the application.
  2. Open your browser’s Developer Tools (F12) and go to the Console tab.
  3. Set a dummy “valid” state in your local storage:
    localStorage.setItem('portal_auth_state', 'my_secret_state');
    
  4. Manually modify the URL to include a different state parameter.
    • Example: https://localhost:3000/app/dashboard?state=attackers_fake_state
    • Note: If using hash routing, ensure it is inside the hash: https://localhost:3000/#/app/dashboard?state=attackers_fake_state
  5. Press Enter to navigate.

Expected Result:

  1. An alert appears: “OAuth state mismatch. Potential CSRF attack. Logging out…”
  2. The user is immediately signed out of the application.

Configuration

light-gateway

Client Credentials Token

All the accesses from the light-gateway to the downstream APIs should have at least one token in the Authorization header. If there is an authorization code token in the Authorization header, then a client credentials token will be added to the X-Scope-Token header by the TokenHandler.

Since all light portal services have the same scopes (portal.r and portal.w), one token should be enough for accessing all APIs.

Add the client credentials token config to the client.yml section.

# Client Credential
client.tokenCcUri: /oauth2/N2CMw0HGQXeLvC1wBfln2A/token
client.tokenCcClientId: f7d42348-c647-4efb-a52d-4c5787421e72
client.tokenCcClientSecret: f6h1FTI8Q3-7UScPZDzfXA
client.tokenCcScope:
  - portal.r
  - portal.w
  - ref.r
  - ref.w

Add TokenHandler to the handler.yml section.

# handler.yml
handler.handlers:
  .
  .
  .
  - com.networknt.router.middleware.TokenHandler@token
  .
  .
  .
handler.chains.default:
  .
  .
  .
  - prefix
  - token
  - router

Add the TokenHandler configuration to the token.yml section.

# token.yml
token.enabled: true
token.appliedPathPrefixes:
  - /r
  

light-reference

Cors Configuration

As the light-gateway handles the SPA interaction and CORS, we don’t need to enable CORS on the reference API. However, the CORS handler is still registered in the default handler.yml in case the reference API is used as a standalone service.

In the light-portal configuration, we need to disable CORS.

# cors.yml
cors.enabled: false

Client Configuration

We need to load the JWK from the oauth-kafka service to validate the incoming JWT tokens. To set up the JWK, add the following lines to the values.yml file.

# client.yml
client.tokenKeyServerUrl: https://localhost:6881
client.tokenKeyUri: /oauth2/N2CMw0HGQXeLvC1wBfln2A/keys

License

Bronze

Silver

Gold