Introduction
Architecture
Design
Light Portal is an application that connects providers to consumers, and it is composed of many components or applications. Each component exposes a set of API endpoints and a user interface within the portal view single-page application.
To help users understand the design of each component in detail, we have collected all the design documents in this section.
Multi-Tenant
Database Schema
We're facing a classic multi-tenancy database design decision. Adding a host_id to every table is one approach, but it does lead to composite primary keys and can impact performance. Using UUIDs as primary keys, even in a multi-tenant environment, is another viable option with its own set of trade-offs. Let's examine both strategies:
- Host ID on Every Table (Composite Primary Keys)
Schema: Each table would have a host_id column, and the primary key would be a combination of host_id and another unique identifier (e.g., user_id, endpoint_id).
CREATE TABLE user_t (
host_id UUID NOT NULL, -- References hosts table
user_id INT NOT NULL,
-- ... other columns
PRIMARY KEY (host_id, user_id),
FOREIGN KEY (host_id) REFERENCES hosts_t(host_id)
);
Pros:
- Data Isolation: Clear separation of data at the database level. Easy to query data for a specific tenant.
- Backup/Restore: Simplified backup and restore procedures for individual tenants.
Cons:
- Composite Primary Keys: Can lead to more complex queries, especially joins, as you always need to include the host_id. Can affect query optimizer performance.
- Storage Overhead: host_id is repeated in every row of every table, adding storage overhead.
- Index Impact: Composite indexes can sometimes be less efficient than single-column indexes.
- UUIDs as Primary Keys (Shared Tables)
Schema: Tables use UUIDs as primary keys. A separate table (tenant_resources_t) maps UUIDs to tenants.
CREATE TABLE user_t (
user_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
-- ... other columns
);
CREATE TABLE tenant_resources_t (
host_id UUID NOT NULL,
resource_type varchar(255) NOT NULL, --e.g., 'user', 'api_endpoint'
resource_id UUID NOT NULL,
PRIMARY KEY(host_id, resource_type, resource_id),
FOREIGN KEY (host_id) REFERENCES hosts_t(host_id)
);
Pros:
- Simplified Primary Keys: Easier to manage single-column UUID primary keys. Simpler joins.
- Reduced Storage Overhead: No need to repeat host_id in every table.
- Application Logic: Multi-tenancy is handled mostly in the application logic by querying tenant_resources_t to ensure a user belongs to the correct tenant, adding a layer of flexibility. (This is also a con if not carefully implemented; see the membership-check sketch after the cons below.)
Cons:
- Data Isolation (slightly reduced): Data is logically separated but resides in shared tables. Robust application logic is essential to prevent data leakage between tenants.
- Backup/Restore (more complex): Backing up/restoring for a single tenant requires filtering based on the tenant_resources_t table.
- Query Performance (potential issue): Queries might require joining with the tenant_resources_t table, which adds some overhead. Proper indexing and query optimization become crucial.
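As a minimal sketch of the membership check mentioned under Application Logic above (assuming the tenant_resources_t mapping table defined earlier; :host_id and :user_id are bind parameters supplied by the application), the service can verify tenant ownership before returning any data:
-- Returns a row only when the user is registered as a resource of the
-- requesting tenant; otherwise the service rejects the request.
SELECT 1
FROM tenant_resources_t tr
WHERE tr.host_id = :host_id
  AND tr.resource_type = 'user'
  AND tr.resource_id = :user_id;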
- Hybrid Approach (Recommended in many cases)
- Combine the strengths of both approaches.
- Use UUIDs as primary keys for most tables for simplicity.
- Add host_id to tables where data isolation is paramount (e.g., tables containing sensitive financial or personal data) or where frequent tenant-specific queries are performed. This provides a more granular level of control and allows for optimized queries when needed.
- Use the tenant_resources_t table to maintain an overview of which resources belong to which tenant, supporting the application logic's multi-tenancy enforcement.
Choosing the Right Strategy:
- Small Number of Tenants, Low Data Volume: UUIDs might be sufficient.
- Strict Data Isolation Requirements, Frequent Tenant-Specific Queries, High Data Volume: host_id on every relevant table or a hybrid approach is often preferable.
- Flexibility in Multi-Tenancy Implementation: Favors the UUID approach or the hybrid approach, with logic predominantly handled in the application layer.
Additional Considerations:
- Database Support: Check if your database (PostgreSQL) has specific features for multi-tenancy.
- Indexing: Carefully plan your indexes for optimal performance in any multi-tenant scenario.
- Schema Design: Consider creating views for tenant-specific data access to simplify queries in the application layer (see the sketch after this list).
- Security: Implement robust security measures to prevent data leakage between tenants, regardless of the chosen approach.
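As an illustration of the Schema Design point above, a tenant-scoped view can hide the host_id filter from application queries. This is only a sketch; it assumes the composite-key variant of user_t and an app.current_host_id session setting populated by the application:
-- Each session sets its tenant first, e.g.:
--   SET app.current_host_id = '2f1c...';
CREATE VIEW tenant_user_v AS
SELECT u.*
FROM user_t u
WHERE u.host_id = current_setting('app.current_host_id')::uuid;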
The hybrid approach usually provides the best balance between performance, data isolation, and development complexity in real-world multi-tenant applications. You have more direct control where strict isolation is required and can maintain the simplicity of UUIDs where it's beneficial. Using tenant_resources_t for application logic enforcement offers flexibility and a central point of management for tenant resource association. This approach also prepares you for potential future expansion and different multi-tenancy demands as your application grows.
Citus PostgreSQL Extension
Citus, a distributed database extension for PostgreSQL, can be very helpful in scaling your multi-tenant application, especially if you anticipate significant data growth and high query loads. Here's how Citus can fit into your use case and the factors to consider:
How Citus Helps:
- Horizontal Scalability: Citus allows you to distribute your data across multiple PostgreSQL nodes (servers), enabling horizontal scaling. This is crucial for handling increasing data volumes and query loads in a multi-tenant environment.
- Improved Query Performance: By distributing data and queries, Citus can significantly improve the performance of many types of queries, especially analytical queries that operate on large datasets. This is particularly beneficial if you have tenants with substantially different data volumes or query patterns.
- Shard Placement by Tenant: One of the most effective ways to use Citus for multi-tenancy is to shard your data by host_id (or a tenant ID). This means that all data for a given tenant resides on the same shard (a subset of the distributed database). This allows for efficient tenant isolation and simplifies queries for tenant-specific data.
- Simplified Multi-Tenant Queries: When sharding by tenant, queries that filter by host_id become very efficient because Citus can route them directly to the appropriate shard. This eliminates the need for expensive scans across the entire database.
- Flexibility: Citus supports various sharding strategies, allowing you to choose the best approach for your data and query patterns. You can even use a hybrid approach, distributing some tables while keeping others replicated across all nodes for faster access to shared data.
Example (Sharding by Tenant):
Create a distributed table: When creating your tables (e.g., user_t, api_endpoint_t, etc.), you create them as regular tables and then register them as distributed tables in Citus, using host_id as the distribution column:
CREATE TABLE user_t (
host_id UUID NOT NULL,
user_id UUID NOT NULL DEFAULT uuid_generate_v4(),
-- ... other columns
PRIMARY KEY (host_id, user_id) -- the distribution column must be part of the primary key
);
SELECT create_distributed_table('user_t', 'host_id');
Querying: When querying data for a specific tenant, include the host_id in your WHERE clause:
SELECT * FROM user_t WHERE host_id = 'your-tenant-id';
Citus will automatically route this query to the shard containing the data for that tenant, resulting in much faster query execution.
Citus Cost:
- Citus Open Source: The Citus open-source extension is free to use; you install it into your own PostgreSQL servers and can self-host and manage it.
- Azure Cosmos DB for PostgreSQL (Managed Citus): Microsoft offers a fully managed cloud service called Azure Cosmos DB for PostgreSQL, which is built on Citus. This service has usage-based pricing, and the cost depends on factors like the number of nodes, storage, and compute resources used. This managed option reduces the operational overhead of managing Citus yourself.
Recommendation:
Don't automatically add host_id to every table just because you're using Citus. Carefully analyze your data model, query patterns, and multi-tenancy requirements.
- Distribute tables by host_id (tenant ID) when data locality and isolation are paramount, and you want to optimize tenant-specific queries.
- Consider replicating smaller, frequently joined tables to avoid unnecessary joins and host_id overhead (see the sketch below).
- Use a central mapping table (tenant_resources_t) to manage tenant-resource associations and enforce multi-tenancy rules in your application logic where appropriate.
This more nuanced approach provides a balance between the benefits of distributed data with Citus and avoiding unnecessary complexity or performance overhead from overusing host_id. Choose the Citus deployment model (self-hosted open source or managed cloud service) that best suits your needs and budget.
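For the second point, Citus provides reference tables that are replicated to every worker node. A minimal sketch, assuming the ref_table_t lookup table defined later in this document is small and shared across tenants:
-- Replicate a small, frequently joined lookup table to every node so
-- joins against it stay local to each shard.
SELECT create_reference_table('ref_table_t');
Tenant-scoped tables continue to be distributed by host_id, as in the sharding example above.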
Primary Key Considerations in a Distributed Citus Environment
When a table includes host_id (due to sharding requirements), it is important to include host_id as part of the primary key. This ensures proper functioning and optimization within the Citus distributed database.
- Distribution Column Requirement: In Citus, the distribution column (e.g., host_id) must be part of the primary key. This is essential for routing queries and distributing data correctly across shards.
- Uniqueness Enforcement: The primary key enforces uniqueness across the entire distributed database. For example, if user_id is unique only within a tenant (host), then (host_id, user_id) is required as the primary key to ensure uniqueness across all shards.
- Data Locality and Co-location: Including host_id in the primary key ensures that all rows for the same tenant (identified by the same host_id) are stored together on a single shard. This provides:
  - Efficient Joins: Joins between tables related to the same tenant can be performed locally on a single shard, avoiding expensive cross-shard data transfers.
  - Optimized Queries: Queries filtering by host_id are efficiently routed to the appropriate shard.
- Referential Integrity: If other tables reference the user_t table and are also distributed by host_id, including host_id in the primary key of user_t is essential to maintain referential integrity across shards. A brief sketch follows this list.
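The sketch below uses two illustrative tables (account_t and transaction_t are examples only, not part of the portal schema): both carry host_id in the primary key, are co-located on host_id, and the foreign key includes the distribution column so it can be enforced within a single shard.
CREATE TABLE account_t (
host_id UUID NOT NULL,
account_id UUID NOT NULL,
PRIMARY KEY (host_id, account_id)
);
CREATE TABLE transaction_t (
host_id UUID NOT NULL,
transaction_id UUID NOT NULL,
account_id UUID NOT NULL,
PRIMARY KEY (host_id, transaction_id)
);
-- Distribute both tables by host_id and co-locate them on the same shards.
SELECT create_distributed_table('account_t', 'host_id');
SELECT create_distributed_table('transaction_t', 'host_id', colocate_with => 'account_t');
-- The foreign key includes host_id, so Citus can enforce it locally per shard.
ALTER TABLE transaction_t
ADD CONSTRAINT transaction_account_fk
FOREIGN KEY (host_id, account_id) REFERENCES account_t (host_id, account_id);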
Reference Table
When building a web application, there will be many dropdown selects in forms. The form itself only cares about the id and label list to render the dropdown; only the id is submitted to the backend API for a single select, and several ids for a multiple select.
To save the effort of creating many similar tables, we can create a set of tables for all dropdowns. For some of the reference tables, the dropdown should be the same across all hosts, and we can set the common flag to 'Y' so that they are shared by all hosts. If the dropdown values might differ between hosts, we can create a reference table per host and link the reference table with the host in a separate table that supports sharding (see the query sketch after the schema below).
Reference Schema
CREATE TABLE ref_table_t (
table_id VARCHAR(22) NOT NULL, -- UUID generated by Util
table_name VARCHAR(80) NOT NULL, -- Name of the ref table for lookup.
table_desc VARCHAR(1024) NULL,
active CHAR(1) NOT NULL DEFAULT 'Y', -- Only active table returns values
editable CHAR(1) NOT NULL DEFAULT 'Y', -- Table value and locale can be updated via ref admin
common CHAR(1) NOT NULL DEFAULT 'Y', -- The drop down shared across hosts
PRIMARY KEY(table_id)
);
CREATE TABLE ref_host_t (
table_id VARCHAR(22) NOT NULL,
host_id VARCHAR(22) NOT NULL,
PRIMARY KEY (table_id, host_id),
FOREIGN KEY (table_id) REFERENCES ref_table_t (table_id) ON DELETE CASCADE,
FOREIGN KEY (host_id) REFERENCES host_t (host_id) ON DELETE CASCADE
);
CREATE TABLE ref_value_t (
value_id VARCHAR(22) NOT NULL,
table_id VARCHAR(22) NOT NULL,
value_code VARCHAR(80) NOT NULL, -- The dropdown value
start_time TIMESTAMP NULL,
end_time TIMESTAMP NULL,
display_order INT, -- for editor and dropdown list.
active VARCHAR(1) NOT NULL DEFAULT 'Y',
PRIMARY KEY(value_id),
FOREIGN KEY (table_id) REFERENCES ref_table_t (table_id) ON DELETE CASCADE
);
CREATE TABLE value_locale_t (
value_id VARCHAR(22) NOT NULL,
language VARCHAR(2) NOT NULL,
value_desc VARCHAR(256) NULL, -- The dropdown label in this language.
PRIMARY KEY(value_id,language),
FOREIGN KEY (value_id) REFERENCES ref_value_t (value_id) ON DELETE CASCADE
);
CREATE TABLE relation_type_t (
relation_id VARCHAR(22) NOT NULL,
relation_name VARCHAR(32) NOT NULL, -- The lookup keyword for the relation.
relation_desc VARCHAR(1024) NOT NULL,
PRIMARY KEY(relation_id)
);
CREATE TABLE relation_t (
relation_id VARCHAR(22) NOT NULL,
value_id_from VARCHAR(22) NOT NULL,
value_id_to VARCHAR(22) NOT NULL,
active VARCHAR(1) NOT NULL DEFAULT 'Y',
PRIMARY KEY(relation_id, value_id_from, value_id_to),
FOREIGN KEY (relation_id) REFERENCES relation_type_t (relation_id) ON DELETE CASCADE,
FOREIGN KEY (value_id_from) REFERENCES ref_value_t (value_id) ON DELETE CASCADE,
FOREIGN KEY (value_id_to) REFERENCES ref_value_t (value_id) ON DELETE CASCADE
);
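As a sketch of how a dropdown could be resolved from this schema (the actual query lives in the reference service; :host_id, :table_name, and :language are bind parameters), the id/label pairs for a form come from ref_value_t and value_locale_t, restricted to tables that are either common or linked to the requesting host:
SELECT v.value_id AS id,
       l.value_desc AS label
FROM ref_table_t t
JOIN ref_value_t v ON v.table_id = t.table_id
JOIN value_locale_t l ON l.value_id = v.value_id AND l.language = :language
LEFT JOIN ref_host_t h ON h.table_id = t.table_id
WHERE t.table_name = :table_name
  AND t.active = 'Y'
  AND v.active = 'Y'
  AND (t.common = 'Y' OR h.host_id = :host_id)
ORDER BY v.display_order;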
Authentication & Authorization
Light-Portal is a single-page application (SPA) that utilizes both the OAuth 2.0 Authorization Code and Client Credentials flows.
The following pattern illustrates the end-to-end process recommended by the Light Platform for an SPA interacting with downstream APIs.
Sequence Diagram
sequenceDiagram participant PortalView as Portal View participant LoginView as Login View participant Gateway as Light Gateway participant OAuthKafka as OAuth-Kafka participant AuthService as Auth Service participant ProxySidecar as Proxy Sidecar participant BackendAPI as Backend API PortalView ->> LoginView: 1. Signin redirect LoginView ->> OAuthKafka: 2. Authenticate user OAuthKafka ->> AuthService: 3. Authenticate User<br/>(Active Directory<br/>for Employees)<br/>(CIF System<br/>for Customers) AuthService ->> OAuthKafka: 4. Authenticated OAuthKafka ->> OAuthKafka: 5. Generate auth code OAuthKafka ->> PortalView: 6. Redirect with code PortalView ->> Gateway: 7. Authorization URL<br/>with code param Gateway ->> OAuthKafka: 8. Create JWT access<br/>token with code OAuthKafka ->> OAuthKafka: 9. Generate JWT<br/>access token<br/>with user claims OAuthKafka ->> Gateway: 10. Token returns<br/>to Gateway Gateway ->> PortalView: 11. Token returns<br/>to Portal View<br/>in Secure Cookie PortalView ->> Gateway: 12. Call Backend API Gateway ->> Gateway: 13. Verify the token Gateway ->> OAuthKafka: 14. Create Client<br/>Credentials token OAuthKafka ->> OAuthKafka: 15. Generate Token<br/>with Scopes OAuthKafka ->> Gateway: 16. Return the<br/>scope token Gateway ->> Gateway: 17. Add scope<br/>token to<br/>X-Scope-Token<br/>Header Gateway ->> ProxySidecar: 18. Invoke API ProxySidecar ->> ProxySidecar: 19. Verify<br/>Authorization<br/>token ProxySidecar ->> ProxySidecar: 20. Verify<br/>X-Scope-Token ProxySidecar ->> ProxySidecar: 21. Fine-Grained<br/>Authorization ProxySidecar ->> BackendAPI: 22. Invoke<br/>business API BackendAPI ->> ProxySidecar: 23. Business API<br/>response ProxySidecar ->> ProxySidecar: 24. Fine-Grained<br/>response filter ProxySidecar ->> Gateway: 25. Return response Gateway ->> PortalView: 26. Return response
- When a user visits the website to access the single-page application (SPA), the Light Gateway serves the SPA to the user's browser. Each single-page application has a dedicated Light Gateway instance acting as a BFF. By default, the user is not logged in and can only access limited site features. To unlock additional features, the user can click the User button in the header and select the Sign In menu. This action redirects the browser from the Portal View to the Login View, both served by the same Light Gateway instance.
- On the Login View page, the user can either input a username and password or choose Google/Facebook for authentication. When the login form is submitted, the request is sent to the Light Gateway with the user's credentials. The Gateway forwards this request to the OAuth Kafka service.
- OAuth Kafka supports multiple authenticator implementations to verify user credentials. Examples include authenticating via the Light Portal user database, Active Directory for employees, or the CIF service for customers.
- Once authentication is successfully completed, OAuth Kafka responds with the authentication result.
- Upon successful authentication, OAuth Kafka generates an authorization code (a UUID associated with the user's profile).
- OAuth Kafka redirects the authorization code back to the browser at the Portal View via the Gateway.
- Since the Portal View SPA lacks a dedicated redirect route for the authorization code, the browser sends the code as a query parameter in a request to the Gateway.
- The StatelessAuthHandler in the Gateway processes this request, initiating a token request to OAuth Kafka to obtain a JWT access token.
- OAuth Kafka generates an access token containing user claims in its custom JWT claims. The authorization code is then invalidated, as it is single-use.
- The access token is returned to the Gateway.
- The StatelessAuthHandler in the Gateway stores the access token in a secure cookie and sends it back to the Portal View.
- When the Portal View SPA makes requests to backend APIs, it includes the secure cookie in the API request sent to the Gateway.
- The StatelessAuthHandler in the Gateway validates the token in the secure cookie and places it in the Authorization header of the outgoing request.
- If the token is successfully validated, the TokenHandler in the Gateway makes a request to OAuth Kafka for a client credentials token, using the path prefix of the API endpoint.
- OAuth Kafka generates a client credentials token with the appropriate scope for accessing the downstream service.
- The client credentials token is returned to the Gateway.
- The TokenHandler in the Gateway inserts this token into the X-Scope-Token header of the original request.
- The Gateway routes the original request, now containing both tokens, to the downstream proxy sidecar of the backend API.
- The proxy sidecar validates the Authorization token, verifying its signature, expiration, and other attributes.
- The proxy sidecar also validates the X-Scope-Token, ensuring its signature, expiration, and scope are correct.
- Once both tokens are successfully validated, the proxy sidecar enforces fine-grained authorization rules based on the user's custom security profile contained in the Authorization token.
- If the fine-grained authorization checks pass, the proxy sidecar forwards the request to the backend API.
- The backend API processes the request and sends the full response back to the proxy sidecar.
- The proxy sidecar applies fine-grained filters to the response, reducing the number of rows and/or columns based on the user's security profile or other policies.
- The proxy sidecar returns the filtered response to the Gateway.
- The Gateway forwards the response to the Portal View, allowing the SPA to render the page.
Fine-Grained Authorization
What is Fine-Grained Authorization?
Fine-grained authorization (FGA) refers to a detailed and precise control mechanism that governs access to resources based on specific attributes, roles, or rules. It's also known as fine-grained access control (FGAC). Unlike coarse-grained authorization, which applies broader access policies (e.g., "Admins can access everything"), fine-grained authorization allows for more specific policies (e.g., "Admins can access user data only if they belong to the same department and the access request is during business hours").
Key Features
- Granular Control: Policies are defined at a detailed level, considering attributes like user role, resource type, action, time, location, etc.
- Context-Aware: Takes into account dynamic conditions such as the time of request, user’s location, or other contextual factors.
- Flexible Policies: Allows the creation of complex, conditional rules tailored to the organization’s needs.
Why Do We Need Fine-Grained Authorization?
1. Enhanced Security
By limiting access based on detailed criteria, fine-grained authorization minimizes the risk of unauthorized access or data breaches.
2. Regulatory Compliance
It helps organizations comply with legal and industry-specific regulations (e.g., GDPR, HIPAA) by ensuring sensitive data is only accessible under strict conditions.
3. Minimized Attack Surface
By restricting access to only the required resources and operations, fine-grained authorization reduces the potential impact of insider threats or compromised accounts.
4. Improved User Experience
Enables personalized access based on roles and permissions, ensuring users see only what they need, which reduces confusion and improves productivity.
5. Auditing and Accountability
Detailed access logs and policy enforcement make it easier to track and audit who accessed what, when, and why, fostering better accountability.
Examples of Use Cases
- Healthcare: A doctor can only view records of patients they are treating.
- Government: A government employee can access data and documents based on security clearance levels and job roles.
- Finance: A teller can only access transactions related to their assigned branch.
- Enterprise Software: Employees can edit documents only if they own them or have been granted editing permissions.
Fine-Grained Authorization in API Access Control
In API access control, fine-grained authorization governs how users or systems interact with specific API endpoints, actions, and data. This approach ensures that access permissions are precisely tailored to attributes, roles, and contextual factors, enabling a secure and customized API experience. As the Light Portal is a platform centered on APIs, the remainder of the design will focus on the API access control context.
Early Approaches to Fine Grained Authorization
Early approaches to fine grained authorization primarily involved Access Control Lists (ACLs) and Role-Based Access Control (RBAC). These methods laid the foundation for more sophisticated access control mechanisms that followed. Here's an overview of these primary approaches:
Access Control Lists (ACLs):
- ACLs were one of the earliest forms of fine grained authorization, allowing administrators to specify access permissions on individual resources for each user or group of users.
- In ACLs, permissions are directly assigned to users or groups, granting or denying access to specific resources based on their identities.
- While effective for small-scale environments with limited resources and users, ACLs became cumbersome as organizations grew. Maintenance issues arose, such as the time required to manage access to an increasing number of resources for numerous users.
Role-Based Access Control (RBAC):
- RBAC emerged as a solution to the scalability and maintenance challenges posed by ACLs. It introduced the concept of roles, which represent sets of permissions associated with particular job functions or responsibilities.
- Users are assigned one or more roles, and their access permissions are determined by the roles they possess rather than their individual identities.
- RBAC can be implemented with varying degrees of granularity. Roles can be coarse-grained, providing broad access privileges, or fine-grained, offering more specific and nuanced permissions based on organizational needs.
- Initially, RBAC appeared to address the limitations of ACLs by providing a more scalable and manageable approach to access control.
Both ACLs and RBAC have their shortcomings:
- Maintenance Challenges: While RBAC offered improved scalability compared to ACLs, it still faced challenges with role management as organizations expanded. The proliferation of roles, especially fine grained ones, led to a phenomenon known as role explosion, where the number of roles grew rapidly, making them difficult to manage effectively.
- Security Risks: RBAC's flexibility also posed security risks. Over time, users might accumulate permissions beyond what they need for their current roles, leading to a phenomenon known as permission creep. This weakened overall security controls and increased the risk of unauthorized access or privilege misuse.
Following the discussion of early approaches to fine-grained authorization, it's crucial to acknowledge that different applications have varying needs for authorization.
Whether to use fine-grained or coarse-grained controls depends on the specific project, and controlling access becomes trickier due to the distributed nature of resources and the differing levels of detail needed across components. Let's delve into the standard models that address these differences.
Standard Models for Implementing FGA
There are several standard models for implementing FGA:
- Attribute-Based Access Control (ABAC): In ABAC, access control decisions are made by evaluating attributes such as user roles, resource attributes (e.g., type, size, status), requested action, current date and time, and any other relevant contextual information. ABAC allows for very granular control over access based on a wide range of attributes.
- Policy-Based Access Control (PBAC): PBAC is similar to ABAC but focuses more on defining policies than directly evaluating attributes. Policies in PBAC typically consist of rules or logic that dictate access control decisions based on various contextual factors. While ABAC relies heavily on data (attributes), PBAC emphasizes using logic to determine access.
- Relationship-Based Access Control (ReBAC): ReBAC emphasizes the relationships between users and resources, as well as relationships between different resources. By considering these relationships, ReBAC provides a powerful and expressive model for describing complex authorization contexts. This can involve the attributes of users and resources and their interactions and dependencies.
Each of these models offers different strengths and may be more suitable for different scenarios. FGA allows for fine grained control over access, enabling organizations to enforce highly specific access policies tailored to their requirements.
Streamlining FGA by Implementing Rule-Based Access Control:
ABAC (Attribute-Based Access Control) focuses on data attributes, PBAC (Policy-Based Access Control) centers on logic, and ReBAC (Relationship-Based Access Control) emphasizes relationships between users and resources. But what if we combined all three to leverage the strengths of each? This is the idea behind Rule-Based Access Control (RuBAC).
By embedding a lightweight rule engine, we can integrate multiple rules and actions to achieve the following:
- Optimize ABAC: Reduce the number of required attributes since not all rules depend on them. For example, a standard rule like "Customer data can only be accessed during working hours" can be shared across policies.
- Flexible Policy Enforcement: Using a rule engine makes access policies more dynamic and simpler to manage.
- Infer Relationships: Automatically deduce relationships between entities. For instance, the rule engine could grant a user access to a file if they already have permission for the containing folder.
Principle of Least Privilege
The principle of least privilege access control, widely referred to as least privilege or PoLP, is the security concept in which users (for example, employees) are granted the minimum level of access or permissions to the application, data, or system required to perform their job functions.
To ensure PoLP is effectively enforced, we've compiled a list of best practices:
- Conduct a thorough privilege audit: As we know, visibility is critical in an access environment, so conducting regular or periodic access audits of all privileged accounts can help your team gain complete visibility. This audit includes reviewing privileged accounts and credentials held by employees, contractors, and third-party vendors, whether on-premises, accessible remotely, or in the cloud. However, your team must also focus on default and hard-coded credentials, which IT teams often overlook.
- Establish the least privilege as the default: Start by granting new accounts the minimum privileges required for their tasks and eliminate or reconfigure default permissions on new systems or applications. Further, use role-based access control to help your team determine the necessary privileges for a new account by providing general guidelines based on roles and responsibilities. Also, your team needs to update and adjust access level permissions when the user's role changes; this will help prevent privilege creep.
- Enforce separation of privileges: Your team can prevent over-provisioning by limiting administrator privileges. Firstly, segregate administrative accounts from standard accounts, even if they belong to the same user, and isolate privileged user sessions. Then, grant administrative privileges (such as read, write, and execute permissions) only to the extent necessary for the user to perform their specific administrative tasks. This will help your team prevent granting users unnecessary or excessive control over critical systems, which could lead to security vulnerabilities or misconfigurations.
- Provide just-in-time, limited access: To maintain least-privilege access without hindering employee workflows, combine role-based access control with time-limited privileges. Further, replace hard-coded credentials with dynamic secrets or use one-time-use/temporary credentials. This will help your team grant temporary elevated access permissions when users need it, for instance, to complete specific tasks or short-term projects.
- Keep track and evaluate privileged access: Continuously monitor authentications and authorizations across your API platform and ensure all the individual actions are traceable. Additionally, record all authentication and authorization sessions comprehensively, and use automated tools to swiftly identify any unusual activity or potential issues.
These best practices are designed to enhance the security of your privileged accounts, data, and assets while ensuring compliance adherence and improving operational security without disrupting user workflows.
OpenAPI Specification Extensions
OpenAPI uses the term security scheme for authentication and authorization schemes. OpenAPI 3.0 lets you describe APIs protected using the following security schemes. Fine-grained authorization is just another layer of security, so it is natural to define it in the same specification. This can be done with OpenAPI specification extensions.
Extensions (also referred to as specification extensions or vendor extensions) are custom properties that start with x-, such as x-logo. They can be used to describe extra functionality that is not covered by the standard OpenAPI Specification. Many API-related products that support OpenAPI make use of extensions to document their own attributes, such as Amazon API Gateway, ReDoc, APIMatic and others.
As OpenAPI specification openapi.yaml is loaded during the light-4j startup, the extensions will be available at runtime in cache for each endpoint just like the scopes definition. The API owner can define the following two extensions for each endpoint:
- x-request-access: This section allows the designer to specify one or more rules as well as one or more security attributes as input to the rules, for example roles, location, etc. The rule result decides whether the user has access to the endpoint based on the security attributes from the JWT token in the request chain.
- x-response-filter: This section is similar to the above; however, it works on the response chain. The rule result decides which rows or columns of the response JSON are returned to the user based on the security profile from the JWT token.
Example of OpenAPI specification with fine-grained authorization.
paths:
  /accounts:
    get:
      summary: "List all accounts"
      operationId: "listAccounts"
      x-request-access:
        rule: "account-cc-group-role-auth"
        roles: "manager teller customer"
      x-response-filter:
        rule: "account-row-filter"
        teller:
          status: open
        customer:
          status: open
          owner: "@user_id"
        rule: "account-col-filter"
        teller: ["num","owner","type","firstName","lastName","status"]
        customer: ["num","owner","type","firstName","lastName"]
      security:
        - account_auth:
            - "account.r"
FGA Rules for AccessControlHandler
With the above specification loaded at runtime, the rules are also loaded during server startup for the service. In the Rule Registry on the light-portal, we have a set of built-in rules that can be picked as fine-grained policies for each API. Here is an example of a rule for the above specification, referenced in x-request-access.
account-cc-group-role-auth:
  ruleId: account-cc-group-role-auth
  host: lightapi.net
  description: Role-based authorization rule for account service and allow cc token and transform group to role.
  conditions:
    - conditionId: allow-cc
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.user_id
      operatorCode: NIL
      joinCode: OR
      index: 1
    - conditionId: manager
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.groups
      operatorCode: CS
      joinCode: OR
      index: 2
      conditionValues:
        - conditionValueId: manager
          conditionValue: admin
    - conditionId: teller
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.groups
      operatorCode: CS
      joinCode: OR
      index: 3
      conditionValues:
        - conditionValueId: teller
          conditionValue: frontOffice
    - conditionId: allow-role-jwt
      variableName: auditInfo
      propertyPath: subject_claims.ClaimsMap.roles
      operatorCode: NNIL
      joinCode: OR
      index: 4
  actions:
    - actionId: match-role
      actionClassName: com.networknt.rule.FineGrainedAuthAction
      actionValues:
        - actionValueId: roles
          value: $roles
All rules are managed by the light-portal and shared by all the services. In addition, developers can create their customized rules for their own services.
JSON Schema Registry
JSON Schema is a declarative language that provides a standardized way to describe and validate JSON data.
What it does
JSON Schema defines the structure, content, data types, and constraints of JSON documents. It's an IETF standard that helps ensure the consistency and integrity of JSON data across applications.
How it works
JSON Schema uses keywords to define data properties. A JSON Schema validator checks if JSON documents conform to the schema.
What it's useful for
- Describing existing data formats
- Validating data as part of automated testing
- Submitting client data
- Defining how a record should be organized
What is a JSON Schema Registry
The JSON Schema Registry provides a centralized service for your JSON schemas with RESTful endpoints for storing and retrieving JSON schemas.
When using data in a distributed application with many RESTful APIs, it is important to ensure that it is well-formed and structured. If data is sent without prior validation, errors may occur on the services. A schema registry provides a way to ensure that the data is validated before it is sent and validated after it is received.
A schema registry is a service used to define and confirm the structure of data that is sent between consumers and providers. In a schema registry, developers can define what the data should look like and how it should be validated. The schemas can be utilized in the OpenAPI specifications to ensure that schemas can be externalized.
Schema records can also help ensure forward and backward compatibility when changes are made to the data structure. When a schema record is used, the data is transferred with additional schema information that applications reading the data can use to interpret it.
Given the API consumers and providers can belong to different groups or organizations, it is necessary to have a centralized service to manage the schemas so that they can be shared between them. This is why we have implemented this service as part of the light-portal.
Schema Specification Version
The registry is a heterogeneous registry, as it can store schemas of different draft versions. By default, the registry is configured to store schemas using Draft 2020-12. When a schema is added, it is saved with whichever specification version is currently configured.
The following list contains all supported specification versions.
- Draft 4
- Draft 6
- Draft 7
- 2019-09
- 2020-12
Schema Version
Once a schema is registered in the registry, it is assigned version 1. Each time it is updated, the version number increases by 1. When the schema is retrieved, the version number can be part of the URL to indicate that the exact version should be retrieved. If the version number is not in the URL, the latest version is retrieved.
Access Endpoint
Table Structure
Light Controller
YAML Rule Registry
React Schema Form
React Schema Form is a form generator based on JSON Schema and form definitions from Light Portal. It renders UI forms to manipulate database entities, and form submissions are automatically hooked into an API endpoint.
Debugging a Component
Encountering a bug in a react-schema-form component can be challenging since the source code may not be directly visible. To debug:
- Set up the Light Portal server if dropdowns are loaded from the server.
- Use the example app in the same project to debug.
Use a Local Alias with Vite
Vite allows creating an alias to point to your library's src folder. Update the vite.config.ts in your example app:
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import path from 'path';

export default defineConfig({
  plugins: [react()],
  resolve: {
    alias: {
      // Adjust the path to point to the library's `src` folder
      'react-schema-form': path.resolve(__dirname, '../src'),
    },
  },
});
Use a Link Script in package.json
Update the example app's package.json file. In the dependencies section, replace the library's version with a local path:
{
"dependencies": {
"react-schema-form": "file:../src"
}
}
Library Entry Point
Vite requires an entry point file, typically named index.js or index.ts, in your library's src folder. Ensure that your library's src folder includes a properly configured index.js file, like this:
export { default as SchemaForm } from './SchemaForm'
export { default as ComposedComponent } from './ComposedComponent'
export { default as utils } from './utils'
export { default as Array } from './Array'
Without a correctly named and configured entry file, components like SchemaForm may not be imported properly.
Update index.html
If you change the entry point file from main.js to index.js, ensure you update the reference in the index.html file located in the root folder. For example:
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Vite + React</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/index.js"></script>
</body>
</html>
Sync devDependencies from peerDependencies
When the source code in src is used directly by the example app, the peerDependencies in the example app won't work for react-schema-form components. To address this, copy the peerDependencies into the devDependencies section of react-schema-form's package.json. For example:
"devDependencies": {
"@babel/runtime": "^7.26.0",
"@codemirror/autocomplete": "^6.18.2",
"@codemirror/language": "^6.10.6",
"@codemirror/lint": "^6.8.2",
"@codemirror/search": "^6.5.7",
"@codemirror/state": "^6.4.1",
"@codemirror/theme-one-dark": "^6.1.2",
"@codemirror/view": "^6.34.2",
"@emotion/react": "^11.13.5",
"@emotion/styled": "^11.13.5",
"@eslint/js": "^9.13.0",
"@lezer/common": "^1.2.3",
"@mui/icons-material": "^6.1.6",
"@mui/material": "^6.1.6",
"@mui/styles": "^6.1.6",
"@types/react": "^18.3.1",
"@uiw/react-markdown-editor": "^6.1.2",
"@vitejs/plugin-react": "^4.3.3",
"codemirror": "^6.0.1",
"eslint": "^9.13.0",
"eslint-plugin-react": "^7.37.2",
"eslint-plugin-react-hooks": "^5.0.0",
"eslint-plugin-react-refresh": "^0.4.14",
"gh-pages": "^6.2.0",
"globals": "^15.11.0",
"react": "^18.3.1",
"react-dom": "^18.3.1",
"vite": "^6.0.3"
},
"peerDependencies": {
"@babel/runtime": "^7.26.0",
"@codemirror/autocomplete": "^6.18.2",
"@codemirror/language": "^6.10.6",
"@codemirror/lint": "^6.8.2",
"@codemirror/search": "^6.5.7",
"@codemirror/state": "^6.4.1",
"@codemirror/theme-one-dark": "^6.1.2",
"@codemirror/view": "^6.34.2",
"@emotion/react": "^11.13.5",
"@emotion/styled": "^11.13.5",
"@lezer/common": "^1.2.3",
"@mui/icons-material": "^6.1.6",
"@mui/material": "^6.1.6",
"@mui/styles": "^6.1.6",
"@types/react": "^18.3.1",
"@uiw/react-markdown-editor": "^6.1.2",
"codemirror": "^6.0.1",
"react": "^18.3.1",
"react-dom": "^18.3.1"
},
Additionally, ensure the peerDependencies are also synced with the dependencies section of the example app's package.json. This step allows react-schema-form components to load independently and work seamlessly during development.
Update Source Code
After completing all the updates, perform a clean install for both react-schema-form and the example app. Then, start the server from the example folder using the following command:
yarn dev
Whenever you modify a react-schema-form component, simply refresh the browser to reload the example application and see the updated component in action.
Debug with Visual Studio Code
You can debug the component using Visual Studio Code. There are many tutorials available online that explain how to debug React applications built with Vite, which can help you set up breakpoints, inspect components, and track down issues effectively.
Component dynaselect
dynaselect is a component that renders a dropdown select, either from static options or options loaded dynamically from a server via an API endpoint. It is a wrapper around the Material UI Autocomplete component. Below is an example form from the example app that demonstrates how to use this component.
{
"schema": {
"type": "object",
"title": "React Component Autocomplete Demo Static Single",
"properties": {
"name": {
"title": "Name",
"type": "string",
"default": "Steve"
},
"host": {
"title": "Host",
"type": "string"
},
"environment": {
"type": "string",
"title": "Environment",
"default": "LOCAL",
"enum": [
"LOCAL",
"SIT1",
"SIT2",
"SIT3",
"UAT1",
"UAT2"
]
},
"stringarraysingle": {
"type": "array",
"title": "Single String Array",
"items": {
"type": "string"
}
},
"stringcat": {
"type": "string",
"title": "Joined Strings"
},
"stringarraymultiple": {
"type": "array",
"title": "Multiple String Array",
"items": {
"type": "string"
}
}
},
"required": [
"name",
"environment"
]
},
"form": [
"name",
{
"key": "host",
"type": "dynaselect",
"multiple": false,
"action": {
"url": "https://localhost/portal/query?cmd=%7B%22host%22%3A%22lightapi.net%22%2C%22service%22%3A%22user%22%2C%22action%22%3A%22listHost%22%2C%22version%22%3A%220.1.0%22%7D"
}
},
{
"key": "environment",
"type": "dynaselect",
"multiple": false,
"options": [
{
"id": "LOCAL",
"label": "Local"
},
{
"id": "SIT1",
"label": "SIT1"
},
{
"id": "SIT2",
"label": "SIT2"
},
{
"id": "SIT3",
"label": "SIT3"
},
{
"id": "UAT1",
"label": "UAT1"
},
{
"id": "UAT2",
"label": "UAT2"
}
]
},
{
"key": "stringarraysingle",
"type": "dynaselect",
"multiple": false,
"options": [
{
"id": "id1",
"label": "label1"
},
{
"id": "id2",
"label": "label2"
},
{
"id": "id3",
"label": "label3"
},
{
"id": "id4",
"label": "label4"
},
{
"id": "id5",
"label": "label5"
},
{
"id": "id6",
"label": "label6"
}
]
},
{
"key": "stringcat",
"type": "dynaselect",
"multiple": true,
"options": [
{
"id": "id1",
"label": "label1"
},
{
"id": "id2",
"label": "label2"
},
{
"id": "id3",
"label": "label3"
},
{
"id": "id4",
"label": "label4"
},
{
"id": "id5",
"label": "label5"
},
{
"id": "id6",
"label": "label6"
}
]
},
{
"key": "stringarraymultiple",
"type": "dynaselect",
"multiple": true,
"options": [
{
"id": "id1",
"label": "label1"
},
{
"id": "id2",
"label": "label2"
},
{
"id": "id3",
"label": "label3"
},
{
"id": "id4",
"label": "label4"
},
{
"id": "id5",
"label": "label5"
},
{
"id": "id6",
"label": "label6"
}
]
}
]
}
Dynamic Options from APIs
The host field is a string type rendered as a dynaselect with multiple set to false. The options for the select are loaded via an API endpoint, with the action URL provided. Note that the cmd query parameter value is encoded because it contains curly brackets {}.
To encode and decode the query parameter value, you can use any online URL encoder/decoder:
Encoded:
%7B%22host%22%3A%22lightapi.net%22%2C%22service%22%3A%22user%22%2C%22action%22%3A%22listHost%22%2C%22version%22%3A%220.1.0%22%7D
Decoded:
{"host":"lightapi.net","service":"user","action":"listHost","version":"0.1.0"}
When using the example app to test the react-schema-form with APIs, you need to configure CORS on the light-gateway. Ensure that CORS is enabled only on the light-gateway and not on the backend API, such as hybrid-query.
Here is an example in values.yml for the light-gateway.
# cors.yml
cors.enabled: true
cors.allowedOrigins:
- https://devsignin.lightapi.net
- https://dev.lightapi.net
- https://localhost:3000
- http://localhost:5173
cors.allowedMethods:
- GET
- POST
- PUT
- DELETE
Single string type
For the environment field, the schema defines the type as string, and the form definition specifies multiple: false to indicate it is a single select.
The select result in the model looks like the following:
{
"environment": "SIT1",
}
Single string array type
For the stringarraysingle field, the schema defines the type as a string array, and the form definition specifies multiple: false to indicate it is a single select.
The select result in the model looks like the following:
{
"stringarraysingle": [
"id3"
],
}
Multiple string type
For the stringcat field, the schema defines the type as string, and the form definition specifies multiple: true to indicate it is a multiple select.
The select result in the model looks like the following:
{
"stringcat": "id2,id4"
}
Multiple string array type
For the stringarraymultiple field, the schema defines the type as a string array, and the form definition specifies multiple: true to indicate it is a multiple select.
The select result in the model looks like the following:
{
"stringarraymultiple": [
"id2",
"id5",
"id3"
],
}
User Management
User Type
The user_type field is a critical part of the user security profile in the JWT token and can be leveraged for fine-grained authorization. In a multi-tenant environment, user_type is presented as a dropdown populated from the reference table configured for the organization. It can be dynamically selected based on the host chosen during the user registration process.
Supported Standard Dropdown Models
- Employee and Customer
  - Dropdown values: E (Employee), C (Customer)
  - Default model for the lightapi.net host.
  - Suitable for most organizations.
- Employee, Personal, and Business
  - Dropdown values: E (Employee), P (Personal), B (Business)
  - Commonly used for banks where personal and business banking are separated.
Database Configuration
- The user_type field is nullable in the user_t table by default.
- However, you can enforce this field as mandatory in your application via the schema and UI configuration.
On-Prem Deployment
In on-premise environments, the user_type can determine the authentication method:
- Employees: Authenticated via Active Directory.
- Customers: Authenticated via a customer database.
This flexibility allows organizations to tailor the authentication process based on their specific needs and user classifications.
Handling Users with Multi-Host Access
There are two primary ways to handle users who belong to multiple hosts:
- User-Host Mapping Table:
user_t: This table would not have a host_id and would store core user information that is host-independent. The user_id would be unique across all hosts.
user_host_t (or user_tenant_t): This would be a mapping table to represent the many-to-many relationship between users and hosts.
-- user_t (no host_id, globally unique user_id)
CREATE TABLE user_t (
user_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), -- UUID is recommended
-- ... other user attributes (e.g., name, email)
);
-- user_host_t (mapping table)
CREATE TABLE user_host_t (
user_id UUID NOT NULL,
host_id UUID NOT NULL,
-- ... other relationship-specific attributes (e.g., roles within the host)
PRIMARY KEY (user_id, host_id),
FOREIGN KEY (user_id) REFERENCES user_t (user_id) ON DELETE CASCADE,
FOREIGN KEY (host_id) REFERENCES host_t (host_id) ON DELETE CASCADE -- Assuming you have a host_t table
);
- Duplicating User Records (Less Recommended):
user_t: You would keep host_id in this table, and the primary key would be (host_id, user_id).
User Duplication: If a user needs access to multiple hosts, you would duplicate their user record in user_t for each host they belong to, each with a different host_id.
Why User-Host Mapping is Generally Preferred:
- Data Integrity: Avoids data duplication and the potential for inconsistencies that come with it. If a user's core information (e.g., name, email) changes, you only need to update it in one place in user_t.
- Flexibility: Easier to add or remove a user's access to hosts without affecting their core user data.
- Querying: While you'll need joins to get a user's hosts or a host's users, these joins are straightforward using the mapping table.
- Scalability: Better scalability as your user base and the number of hosts they can access grow.
Distributing Tables in a Multi-Host User Scenario:
With the user-host mapping approach:
- user_t: This table would likely be a reference table in Citus (replicated to all nodes) since it does not have a host_id for distribution.
- user_host_t: This table would be distributed by host_id.
- Other tables (e.g., employee_t, api_endpoint_t, etc.): These would be distributed by host_id as before.
When querying, you would typically:
- Start with the user_host_t table to find the hosts a user has access to.
- Join with other tables (distributed by host_id) based on the host_id to retrieve tenant-specific data (see the sketch below).
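A sketch of that two-step pattern (using the user_host_t and employee_t tables defined later in this section; :user_id and :host_id are bind parameters):
-- 1. Find the hosts a user can access.
SELECT host_id FROM user_host_t WHERE user_id = :user_id;
-- 2. Fetch tenant-specific data, always filtering by host_id so Citus
--    can route the query to a single shard.
SELECT e.*
FROM employee_t e
JOIN user_host_t uh ON uh.host_id = e.host_id AND uh.user_id = e.user_id
WHERE uh.host_id = :host_id
  AND uh.user_id = :user_id;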
Choosing the Right user_id Primary Key:
Here's a comparison of the options for the user_id primary key in user_t:
1. UUID (user_id)
- Pros:
  - Globally Unique: Avoids collisions across hosts or when scaling beyond the current setup.
  - Security: Difficult to guess or enumerate.
  - Scalability: Well-suited for distributed environments like Citus.
- Cons:
  - Storage: Slightly larger storage size compared to integers.
  - Readability: Not human-readable, which can be inconvenient for debugging.
- Recommendation: This is generally the best option for a user_id in a multi-tenant, distributed environment.
2. Email (email)
- Pros:
  - Human-Readable: Easy to identify and manage.
  - Login Identifier: Often used as a natural login credential.
- Cons:
  - Uniqueness Challenges: Enforcing global uniqueness across all hosts may require complex constraints or application logic.
  - Changeability: If emails change, cascading updates can complicate the database.
  - Security: Using emails as primary keys can expose sensitive user data if not handled securely.
  - Performance: String comparisons are slower than those for integers or UUIDs.
- Recommendation: Not recommended as a primary key, especially in a multi-tenant or distributed setup.
3. User-Chosen Unique ID (e.g., username)
- Pros:
  - Human-Readable: Intuitive and user-friendly.
- Cons:
  - Uniqueness Challenges: Enforcing global uniqueness is challenging and may require complex constraints.
  - Changeability: Users may request username changes, causing cascading update issues.
  - Security: Usernames are easier to guess or enumerate compared to UUIDs.
- Recommendation: Not recommended as a primary key in a multi-tenant, distributed environment.
In Conclusion:
- Use a User-Host Mapping Table: This is the best approach to handle users who belong to multiple hosts in a multi-tenant Citus environment.
- Use UUID for user_id: UUIDs are the most suitable option for the user_id primary key in user_t due to their global uniqueness, security, and scalability.
- Distribute by host_id: Distribute tables that need sharding by host_id, and ensure that foreign keys to distributed tables include host_id.
- Use Reference Tables: For tables like user_t that don't have a host_id, designate them as reference tables in Citus.
This approach provides a flexible and scalable foundation for managing users with multi-host access in your Citus-based multi-tenant application.
User Tables
Using a single user_t table with a user_type discriminator is a good approach for managing both employees and customers in a unified way. Adding optional referral relationships for customers adds a nice dimension as well. Here's a suggested table schema in PostgreSQL, along with explanations and some considerations:
user_t (User Table): This table will store basic information common to both employees and customers.
CREATE TABLE user_t (
user_id VARCHAR(24) NOT NULL,
email VARCHAR(255) NOT NULL,
password VARCHAR(1024) NOT NULL,
language CHAR(2) NOT NULL,
first_name VARCHAR(32) NULL,
last_name VARCHAR(32) NULL,
user_type CHAR(1) NULL, -- E employee C customer or E employee P personal B business
phone_number VARCHAR(20) NULL,
gender CHAR(1) NULL,
birthday DATE NULL,
country VARCHAR(3) NULL,
province VARCHAR(32) NULL,
city VARCHAR(32) NULL,
address VARCHAR(128) NULL,
post_code VARCHAR(16) NULL,
verified BOOLEAN NOT NULL DEFAULT false,
token VARCHAR(64) NULL,
locked BOOLEAN NOT NULL DEFAULT false,
nonce BIGINT NOT NULL DEFAULT 0,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
);
ALTER TABLE user_t ADD CONSTRAINT user_pk PRIMARY KEY ( user_id );
ALTER TABLE user_t ADD CONSTRAINT user_email_uk UNIQUE ( email );
user_host_t (User to host relationship or mapping):
CREATE TABLE user_host_t (
host_id VARCHAR(24) NOT NULL,
user_id VARCHAR(24) NOT NULL,
-- other relationship-specific attributes (e.g., roles within the host)
PRIMARY KEY (host_id, user_id),
FOREIGN KEY (user_id) REFERENCES user_t (user_id) ON DELETE CASCADE,
FOREIGN KEY (host_id) REFERENCES host_t (host_id) ON DELETE CASCADE
);
employee_t (Employee Table): This table will store employee-specific attributes.
CREATE TABLE employee_t (
host_id VARCHAR(22) NOT NULL,
employee_id VARCHAR(50) NOT NULL, -- Employee ID or number or ACF2 ID. Unique within the host.
user_id VARCHAR(22) NOT NULL,
title VARCHAR(255) NOT NULL,
manager_id VARCHAR(50), -- manager's employee_id if there is one.
hire_date DATE,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, employee_id),
FOREIGN KEY (host_id, user_id) REFERENCES user_host_t(host_id, user_id) ON DELETE CASCADE,
FOREIGN KEY (host_id, manager_id) REFERENCES employee_t(host_id, employee_id) ON DELETE CASCADE
);
customer_t (Customer Table): This table will store customer-specific attributes.
CREATE TABLE customer_t (
host_id VARCHAR(24) NOT NULL,
customer_id VARCHAR(50) NOT NULL,
user_id VARCHAR(24) NOT NULL,
-- Other customer-specific attributes
referral_id VARCHAR(22), -- the customer_id who refers this customer.
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, customer_id),
FOREIGN KEY (host_id, user_id) REFERENCES user_host_t(host_id, user_id) ON DELETE CASCADE,
FOREIGN KEY (host_id, referral_id) REFERENCES customer_t(host_id, customer_id) ON DELETE CASCADE
);
position_t (Position Table): Defines different positions within the organization for employees.
CREATE TABLE position_t (
host_id VARCHAR(22) NOT NULL,
position_id VARCHAR(22) NOT NULL,
position_name VARCHAR(255) UNIQUE NOT NULL,
description TEXT,
inherit_to_ancestor BOOLEAN DEFAULT FALSE,
inherit_to_sibling BOOLEAN DEFAULT FALSE,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, position_id)
);
employee_position_t (Employee Position Table): Links employees to their positions with effective dates.
CREATE TABLE employee_position_t (
host_id VARCHAR(22) NOT NULL,
employee_id VARCHAR(50) NOT NULL,
position_id VARCHAR(22) NOT NULL,
position_type CHAR(1) NOT NULL, -- P position of own, D inherited from a descendant, S inherited from a sibling.
start_date DATE NOT NULL,
end_date DATE,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, employee_id, position_id),
FOREIGN KEY (host_id, position_id) REFERENCES position_t(host_id, position_id) ON DELETE CASCADE
);
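For instance, a sketch query that lists an employee's currently effective positions on a host, including whether each position is their own or inherited, could look like this (:host_id and :employee_id are bind parameters):
SELECT p.position_name,
       ep.position_type, -- P own, D inherited from a descendant, S inherited from a sibling
       ep.start_date,
       ep.end_date
FROM employee_position_t ep
JOIN position_t p ON p.host_id = ep.host_id AND p.position_id = ep.position_id
WHERE ep.host_id = :host_id
  AND ep.employee_id = :employee_id
  AND ep.start_date <= CURRENT_DATE
  AND (ep.end_date IS NULL OR ep.end_date >= CURRENT_DATE);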
Authorization Strategies
To link users to API endpoints for authorization, we adopt the following approaches, with a rule engine enforcing the policies in the API's sidecar through the access-control middleware handler.
A. Role-Based Access Control (RBAC)
This is a common and relatively simple approach. You define roles (e.g., "admin," "editor," "viewer") and assign permissions to those roles. Users are then assigned to one or more roles.
Role Table:
CREATE TABLE role_t (
host_id VARCHAR(22) NOT NULL,
role_id VARCHAR(22) NOT NULL,
role_name VARCHAR(255) UNIQUE NOT NULL,
description TEXT,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, role_id)
);
Role-Endpoint Permission Table:
CREATE TABLE role_permission_t (
host_id VARCHAR(32) NOT NULL,
role_id VARCHAR(32) NOT NULL,
endpoint_id VARCHAR(64) NOT NULL,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, role_id, endpoint_id),
FOREIGN KEY (host_id, role_id) REFERENCES role_t(host_id, role_id) ON DELETE CASCADE,
FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE
);
Role-User Assignment Table:
CREATE TABLE role_user_t (
host_id VARCHAR(22) NOT NULL,
role_id VARCHAR(22) NOT NULL,
user_id VARCHAR(22) NOT NULL,
start_date DATE NOT NULL DEFAULT CURRENT_DATE,
end_date DATE,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, role_id, user_id, start_date),
FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
FOREIGN KEY (host_id, role_id) REFERENCES role_t(host_id, role_id) ON DELETE CASCADE
);
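A minimal sketch of the role-based check the access-control logic could run for an incoming request; the host, user, and endpoint IDs are hypothetical:
-- Does this user currently have access to this endpoint through any role?
SELECT EXISTS (
    SELECT 1
    FROM role_user_t ru
    JOIN role_permission_t rp
      ON rp.host_id = ru.host_id AND rp.role_id = ru.role_id
    WHERE ru.host_id = 'host-0001'
      AND ru.user_id = 'user-0001'
      AND rp.endpoint_id = 'endpoint-0001'
      AND ru.start_date <= CURRENT_DATE
      AND (ru.end_date IS NULL OR ru.end_date >= CURRENT_DATE)
) AS has_access;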
B. User-Based Access Control (UBAC)
This approach assigns permissions directly to users, allowing for very fine-grained control. It's more flexible but can become complex to manage if you have a lot of users and endpoints. It should only be used for temporary access.
User-Endpoint Permissions Table:
CREATE TABLE user_permission_t (
user_id VARCHAR(22) NOT NULL,
host_id VARCHAR(22) NOT NULL,
endpoint_id VARCHAR(22) NOT NULL,
start_date DATE NOT NULL DEFAULT CURRENT_DATE,
end_date DATE,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (user_id, host_id, endpoint_id),
FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE
);
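A sketch of how a temporary, user-level grant could be recorded; the IDs and end date are hypothetical, and the date window is what makes the grant temporary:
-- Grant a user direct access to one endpoint until a fixed date.
INSERT INTO user_permission_t (user_id, host_id, endpoint_id, start_date, end_date)
VALUES ('user-0001', 'host-0001', 'endpoint-0001', CURRENT_DATE, DATE '2025-12-31');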
C. Group-Based Access Control (GBAC)
You can group users into teams or departments and assign permissions to those groups. This is useful when you want to manage permissions for sets of users with similar access needs.
Groups Table:
CREATE TABLE group_t (
host_id VARCHAR(32) NOT NULL,
group_id VARCHAR(32) NOT NULL,
group_name VARCHAR(255) UNIQUE NOT NULL,
description TEXT,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, group_id)
);
Group-Endpoint Permission Table:
CREATE TABLE group_permission_t (
host_id VARCHAR(32) NOT NULL,
group_id VARCHAR(32) NOT NULL,
endpoint_id VARCHAR(32) NOT NULL,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, group_id, endpoint_id),
FOREIGN KEY (host_id, group_id) REFERENCES group_t(host_id, group_id) ON DELETE CASCADE,
FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE
);
Group-User Membership Table:
CREATE TABLE group_user_t (
host_id VARCHAR(22) NOT NULL,
group_id VARCHAR(22) NOT NULL,
user_id VARCHAR(22) NOT NULL,
start_date DATE NOT NULL DEFAULT CURRENT_DATE,
end_date DATE,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, group_id, user_id, start_date),
FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
FOREIGN KEY (host_id, group_id) REFERENCES group_t(host_id, group_id) ON DELETE CASCADE
);
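A sketch of how the endpoints reachable through a user's current group memberships could be listed; the host and user IDs are hypothetical:
-- Endpoints a user can reach through active group memberships.
SELECT DISTINCT gp.endpoint_id
FROM group_user_t gu
JOIN group_permission_t gp
  ON gp.host_id = gu.host_id AND gp.group_id = gu.group_id
WHERE gu.host_id = 'host-0001'
  AND gu.user_id = 'user-0001'
  AND gu.start_date <= CURRENT_DATE
  AND (gu.end_date IS NULL OR gu.end_date >= CURRENT_DATE);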
D. Attribute-Based Access Control (ABAC)
This approach makes access decisions by evaluating policies against attributes assigned to users and API endpoints, enabling very fine-grained control.
Attribute Table:
CREATE TABLE attribute_t (
host_id VARCHAR(22) NOT NULL,
attribute_id VARCHAR(22) NOT NULL,
attribute_name VARCHAR(255) UNIQUE NOT NULL, -- The name of the attribute (e.g., "department," "job_title," "project," "clearance_level," "location").
attribute_type VARCHAR(50) CHECK (attribute_type IN ('string', 'integer', 'boolean', 'date', 'float', 'list')), -- Define allowed data types
description TEXT,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, attribute_id)
);
Attribute User Table:
CREATE TABLE attribute_user_t (
host_id VARCHAR(22) NOT NULL,
attribute_id VARCHAR(22) NOT NULL,
user_id VARCHAR(22) NOT NULL, -- References user_t
attribute_value TEXT, -- Store values as strings; you can cast later
start_date DATE NOT NULL DEFAULT CURRENT_DATE,
end_date DATE,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, attribute_id, user_id, start_date),
FOREIGN KEY (user_id) REFERENCES user_t(user_id) ON DELETE CASCADE,
FOREIGN KEY (host_id, attribute_id) REFERENCES attribute_t(host_id, attribute_id) ON DELETE CASCADE
);
Attribute Permission Table:
CREATE TABLE attribute_permission_t (
host_id VARCHAR(32) NOT NULL,
attribute_id VARCHAR(32) NOT NULL,
endpoint_id VARCHAR(32) NOT NULL, -- References api_endpoint_t
attribute_value TEXT,
update_user VARCHAR (255) DEFAULT SESSION_USER NOT NULL,
update_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (host_id, attribute_id, endpoint_id),
FOREIGN KEY (endpoint_id) REFERENCES api_endpoint_t(endpoint_id) ON DELETE CASCADE,
FOREIGN KEY (host_id, attribute_id) REFERENCES attribute_t(host_id, attribute_id) ON DELETE CASCADE
);
How it Works:
- Define Attributes: Define all relevant attributes in attribute_t. Think about all the properties of your users, resources, and environment that might be used in access control decisions.
- Assign Attributes to Users: Populate attribute_user_t to associate attribute values with users.
- Assign Attributes to Endpoints: Populate attribute_permission_t to associate attribute values with API endpoints.
- Write Policies: Create policy rules in the rule engine. These rules should use the attribute names defined in attribute_t.
- Policy Evaluation (at runtime):
  - The policy engine receives the subject (user), resource (API endpoint), and action (HTTP method) of the request.
  - The engine retrieves the relevant attributes from the attribute_user_t and attribute_permission_t tables (see the sketch after this list).
  - The engine evaluates the relevant policy rules against those attributes.
  - Based on the policy evaluation result, access is either granted or denied.
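A sketch of the two attribute lookups referenced in the evaluation step above; the rule engine, not the database, performs the actual comparison. The host, user, and endpoint IDs are hypothetical:
-- Attributes currently assigned to the user.
SELECT a.attribute_name, au.attribute_value
FROM attribute_user_t au
JOIN attribute_t a ON a.host_id = au.host_id AND a.attribute_id = au.attribute_id
WHERE au.host_id = 'host-0001'
  AND au.user_id = 'user-0001'
  AND au.start_date <= CURRENT_DATE
  AND (au.end_date IS NULL OR au.end_date >= CURRENT_DATE);

-- Attribute values required by the endpoint; the rule engine compares the two result sets.
SELECT a.attribute_name, ap.attribute_value
FROM attribute_permission_t ap
JOIN attribute_t a ON a.host_id = ap.host_id AND a.attribute_id = ap.attribute_id
WHERE ap.host_id = 'host-0001'
  AND ap.endpoint_id = 'endpoint-0001';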
Key Advantages of ABAC:
- Fine-Grained Control: Express very specific access rules.
- Centralized Policy Management: Policies are stored centrally and can be easily updated.
- Flexibility and Scalability: Adapts easily to changing requirements.
- Auditing and Compliance: Easier to audit and demonstrate compliance.
JWT Security Claims
Using the tables defined above, follow these steps to create an authorization code token with user security claims:
- uid: Assign the user_id to the uid claim in the JWT.
- role: Include a list of roles associated with the user.
- grp: Add a list of groups the user belongs to.
- att: Include a list of key-value pairs representing user attributes.
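Where these claim values could come from, sketched as queries against the tables defined above; the uid claim is the user_id itself, and the att claim can reuse the attribute_user_t query shown in the ABAC section. The host and user IDs are hypothetical:
-- Roles for the "role" claim.
SELECT r.role_name
FROM role_user_t ru
JOIN role_t r ON r.host_id = ru.host_id AND r.role_id = ru.role_id
WHERE ru.host_id = 'host-0001'
  AND ru.user_id = 'user-0001'
  AND ru.start_date <= CURRENT_DATE
  AND (ru.end_date IS NULL OR ru.end_date >= CURRENT_DATE);

-- Groups for the "grp" claim.
SELECT g.group_name
FROM group_user_t gu
JOIN group_t g ON g.host_id = gu.host_id AND g.group_id = gu.group_id
WHERE gu.host_id = 'host-0001'
  AND gu.user_id = 'user-0001'
  AND gu.start_date <= CURRENT_DATE
  AND (gu.end_date IS NULL OR gu.end_date >= CURRENT_DATE);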
Group Management and Dynamic Membership
1. Define Groups Related to the Organizational Structure
You can create groups that align with teams, departments, or other organizational units. These groups are relatively static and reflect the overall organizational structure. Use a separate table, group_t, as described above to store these groups.
2. Use the Reporting Structure to Derive Dynamic Group Memberships
Instead of directly assigning all users to groups, you can:
- Assign base group memberships to individual users (e.g., only non-managers initially).
- Use the reporting relationships stored in the report_relationship_t table to infer additional group memberships based on the organizational hierarchy.
3. Retrieval Logic
- Get User's Direct Groups: Retrieve the groups a user is explicitly assigned to from the group_user_t table.
- Traverse Down the Reporting Hierarchy: Use the report_relationship_t table to find the user's subordinates (direct reports, and indirect reports if desired) in the reporting structure.
- Inherit Subordinate Group Memberships: Retrieve the group memberships of those subordinates and add them to the user's effective group memberships. You can control the depth of inheritance (e.g., only inherit from direct reports or up to a certain level in the hierarchy).
- Combine and Deduplicate: Combine the user's direct group memberships with the inherited memberships, removing any duplicates. A sketch of this retrieval follows the list.
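A minimal sketch of the combined retrieval, using the employee_t.manager_id column defined earlier as the reporting structure; if the hierarchy lives in a separate report_relationship_t table, the recursive join would read from that table instead. The host and employee IDs are hypothetical:
-- Effective groups for one manager: own groups plus groups inherited from all subordinates.
WITH RECURSIVE subordinate AS (
    SELECT host_id, employee_id, user_id
    FROM employee_t
    WHERE host_id = 'host-0001' AND employee_id = 'E10001'   -- the requesting manager
    UNION ALL
    SELECT e.host_id, e.employee_id, e.user_id
    FROM employee_t e
    JOIN subordinate s ON e.host_id = s.host_id AND e.manager_id = s.employee_id
)
SELECT DISTINCT g.group_name
FROM subordinate s
JOIN group_user_t gu ON gu.host_id = s.host_id AND gu.user_id = s.user_id
JOIN group_t g ON g.host_id = gu.host_id AND g.group_id = gu.group_id
WHERE gu.start_date <= CURRENT_DATE
  AND (gu.end_date IS NULL OR gu.end_date >= CURRENT_DATE);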
Example
Let's say:
- Alice is a manager and belongs to the "Management" group.
- Bob reports to Alice and belongs to the "Engineering" group.
- Charlie reports to Bob and belongs to the "Engineering" and "Testing" groups.
When Bob's request comes in:
- The query returns Bob's direct group: "Engineering".
- The reporting structure shows that Charlie reports to Bob.
- Charlie's group memberships, "Engineering" and "Testing", are inherited because Bob is Charlie's manager.
- Bob's effective groups are therefore "Engineering" and "Testing".
When Alice's request comes in:
- The query returns Alice's direct group: "Management".
- The reporting structure shows that Bob reports directly to Alice, and Charlie reports to Alice through Bob.
- The group memberships of Alice's reports, "Engineering" and "Testing", are inherited from Bob and Charlie.
- Alice's effective groups are therefore "Management", "Engineering" and "Testing".
Advantages:
- Reduced Administrative Overhead: You don't have to manually manage group memberships for managers as their teams change.
- Dynamic Access Control: Permissions adapt automatically as the reporting structure evolves.
- Centralized Logic: The inheritance logic lives in one place, making it easier to maintain and update.
Implementation
Sign In
Portal Dashboard
The Portal Dashboard is served by the portal-view single-page application.
- Guest User Access: Upon landing on the dashboard, a guest user can:
  - View certain menus.
  - Perform limited actions within the application.
- Accessing Privileged Features: To access additional features:
  - Click the User button.
  - Select the Sign In menu item.
Login View
- Redirection to Login View: When the Sign In menu item is clicked, the browser is redirected to the Login View single-page application. This application is served by the same instance of light-gateway and handles user authentication against the OAuth 2.0 server (OAuth Kafka) to initiate the Authorization Code grant flow.
- OAuth 2.0 Client ID: The client_id is included in the redirect URL as a query parameter. This ensures that the client_id is sent to the OAuth 2.0 server to obtain the authorization code. In this context, the client_id is associated with the portal-view application.
- Login View Responsibilities: The Login View is a shared single-page application used by all other SPAs across various hosts. It is responsible for:
  - Authenticating users.
  - Ensuring that user credentials are not passed to any other single-page applications or business APIs.
- SaaS Deployment in the Cloud: In a SaaS environment, all users are authenticated by the OAuth 2.0 server using the light-portal user database. As a result, the user type does not need to be passed from the Login View.
- On-Premise Deployment: For on-premise deployments, a customized Login View should include a radio button for selecting the user type. Typical options for most organizations are:
  - Employee (E)
  - Customer (C)
- Customized Authentication: Based on the selected user type:
  - Employees are authenticated via Active Directory.
  - Customers are authenticated using the customer database.
  A customized authenticator implementation should handle this logic, ensuring the correct authentication method is invoked for each user type.
Login Form Submission
- Form Submission Endpoint: /oauth2/N2CMw0HGQXeLvC1wBfln2A/code
- Request Details:
  - Headers: Content-Type: application/x-www-form-urlencoded
  - Method: POST
  - Body Parameters:
    - j_username: The user's username.
    - j_password: The user's password.
    - remember: Indicates whether the session should persist.
    - client_id: The OAuth 2.0 client identifier.
    - state: A hardcoded value (requires additional work for dynamic handling).
    - user_type: (Optional) Specifies the type of user (e.g., employee or customer).
    - redirect_uri: (Optional) The URI to redirect to after authentication.
Light Gateway
The light-gateway instance acts as a BFF (backend for frontend), and it has a routing rule that routes any request with the /oauth2 prefix to the oauth-kafka server.
OAuth Kafka
- LightPortalAuthenticator: The authenticator sends a request to hybrid-query:
{"host":"lightapi.net","service":"user","action":"loginUser","version":"0.1.0","data":{"email":"%s","password":"%s"}}
User Query
- LoginUser: This handler calls the loginUserByEmail method from PortalDbProviderImpl.
PortalDbProviderImpl
The inputs to this method are email, password and userType.
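The SQL inside loginUserByEmail is implementation-specific; as a rough sketch of the kind of lookup it performs, credential verification against the stored hash happens in Java code, not in SQL, and the email address below is a hypothetical placeholder:
-- Hypothetical lookup by email; the selected columns mirror user_t above.
SELECT u.user_id, u.email, u.user_type, u.verified, u.locked
FROM user_t u
WHERE u.email = 'alice@example.com';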
Configuration
light-gateway
Client Credentials Token
All accesses from the light-gateway to the downstream APIs should have at least one token in the Authorization header. If there is an authorization code token in the Authorization header, a client credentials token will be added to the X-Scope-Token header by the TokenHandler.
Since all light portal services have the same scopes (portal.r and portal.w), one token should be enough for accessing all APIs.
Add the client credentials token config to the client.yml section.
# Client Credential
client.tokenCcUri: /oauth2/N2CMw0HGQXeLvC1wBfln2A/token
client.tokenCcClientId: f7d42348-c647-4efb-a52d-4c5787421e72
client.tokenCcClientSecret: f6h1FTI8Q3-7UScPZDzfXA
client.tokenCcScope:
- portal.r
- portal.w
- ref.r
- ref.w
Add TokenHandler to the handler.yml section.
# handler.yml
handler.handlers:
.
.
.
- com.networknt.router.middleware.TokenHandler@token
.
.
.
handler.chains.default:
.
.
.
- prefix
- token
- router
Add the TokenHandler configuration to the token.yml section.
# token.yml
token.enabled: true
token.appliedPathPrefixes:
- /r
light-reference
CORS Configuration
As the light-gateway handles the SPA interaction and CORS, we don't need to enable CORS on the reference API. However, the CORS handler is still registered in the default handler.yml in case the reference API is used as a standalone service.
In the light-portal configuration, we need to disable CORS.
# cors.yml
cors.enabled: false
Client Configuration
We need to load the JWK from the oauth-kafka service to validate incoming JWT tokens. To set up the JWK, add the following lines to the values.yml file.
# client.yml
client.tokenKeyServerUrl: https://localhost:6881
client.tokenKeyUri: /oauth2/N2CMw0HGQXeLvC1wBfln2A/keys