Design of a Scalable Service Broker
- Thursday 1F
Convener: Alan Karp
Notes-taker(s): Alan Karp
Tags for the session - technology discussed/ideas considered:
Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:
Hewlett Packard Enterprise (HPE) has a product called Enterprise Services on Demand (ESO) that simplifies standing up large enterprise services, such as SAP HANA, and enabling external cloud services, such as Salesforce. ESO provides a catalog of services with a small number of preset configurations. Companies with contracts with HPE can order such services with a few clicks, and the service will be ready to run in tens of minutes instead of tens of days.
Other companies have such catalogs, so HPE also planned to offer differentiators, one of which was managing the Service Level Agreements (SLAs) that its enterprise customers have with their cloud providers. The problem is one of scale. HPE has several thousand such customers, each with tens of thousands to hundreds of thousands of employees, and each customer uses hundreds of services. I was given the task of designing the part of the service broker platform that mediates all requests from those users to the services they use.
Conventional solutions could be used if all the services were hosted by HPE or if all requests came from inside the enterprise firewall. In the former case, a reverse proxy would guarantee that the service broker sees all requests; in the latter case, a proxy running in the enterprise would do. Unfortunately, ESO has to deal with requests made to external cloud services from outside the corporate firewall.
I designed a platform with three key components: a User Proxy (UP) running in the enterprise, a Solution Broker (SB) running in HPE data centers, and a Solution Wrapper (SW) running either in an HPE data center or in one associated with the external cloud service. This design meant that identities were relevant only outside the SB. Enterprise users would authenticate to their companies; the SW would authenticate as the users to the cloud service. The SB needed identities only for administrative users.
A user would invoke a service via the UP, which took the form of an app or a web page. The UP would retrieve a blob from the company's Active Directory (AD) or LDAP server. One component of this blob was an OAuth access token authorizing access to the SB. The UP would submit the request and the blob to the SB. The SB would use its private key to decrypt an OAuth token authorizing access to the appropriate SW, then use that token to invoke the SW, forwarding the blob. The SW would use its private key to decrypt the user’s signing key from the blob, verify the signature on the request, and then decrypt the user’s login credentials from the blob and relay session requests and replies back and forth.
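As an illustration only, the request path described above could be modeled roughly as follows. This is a sketch under several assumptions that are not in the notes: the `Sealed` class is a placeholder that merely records which party may open a value (it performs no real public-key encryption), HMAC-SHA256 stands in for whatever signature scheme was intended, the UP is assumed to be the party that signs the request, the SW token is assumed to travel inside the blob, and all class, field, and function names are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Sealed:
    """Placeholder for data encrypted to one party's public key (NOT real crypto)."""
    recipient: str
    payload: Any

    def open(self, holder: str) -> Any:
        # Models "decrypt with its private key": only the named recipient succeeds.
        if holder != self.recipient:
            raise PermissionError(f"{holder} cannot open data sealed for {self.recipient}")
        return self.payload


@dataclass(frozen=True)
class Blob:
    sb_token: str         # OAuth access token authorizing access to the SB
    sw_token: Sealed      # sealed for the SB: OAuth token authorizing access to the SW
    signing_key: Sealed   # sealed for the SW: key for checking the request signature
    credentials: Sealed   # sealed for the SW: the user's login for the cloud service


def user_proxy(request: bytes, user_key: bytes, blob: Blob):
    """UP: sign the request (assumed) and submit it to the SB with the blob."""
    signature = hmac.new(user_key, request, hashlib.sha256).digest()
    return solution_broker(request, signature, blob)


def solution_broker(request: bytes, signature: bytes, blob: Blob):
    """SB: recover the SW token and forward the blob; it keeps no state."""
    if not blob.sb_token:  # real token validation is out of scope for this sketch
        raise PermissionError("missing SB access token")
    sw_token = blob.sw_token.open("SB")
    return solution_wrapper(request, signature, blob, sw_token)


def solution_wrapper(request: bytes, signature: bytes, blob: Blob, sw_token: str):
    """SW: verify the signature, then act as the user toward the cloud service."""
    if not sw_token:
        raise PermissionError("missing SW access token")
    key = blob.signing_key.open("SW")
    expected = hmac.new(key, request, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("request signature does not verify")
    username, _password = blob.credentials.open("SW")
    # A real SW would now log in to the cloud service and relay the session.
    return {"login_as": username, "request": request}
```

In this model the SB can open only the SW token, so calling `blob.credentials.open("SB")` raises `PermissionError`. That mirrors the point that the broker never handles the user's login credentials in the clear.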
This approach had several advantages. There was no need for a CA in order to trust the signing keys. Neither the SB nor the SWs had any permissions of their own, reducing the damage a successful attacker could do. Further, the SB and SWs were highly scalable because they were stateless, not even requiring a backing database. An employee’s access to all services could be blocked by revoking one token, a company could block access to a particular service by revoking one token, and HPE could revoke an enterprise customer’s access to all services by revoking one token.
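The three revocation scopes can be pictured with a small sketch. The notes do not say where revocation was recorded or how it was checked, so the revocation set, the `RequestTokens` type, and the token names below are all hypothetical. The only point illustrated is that each request depends on three tokens and that revoking any one of them cuts off exactly the corresponding scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestTokens:
    """Hypothetical view of the tokens a single request depends on."""
    employee: str          # revoking it blocks this employee from all services
    company_service: str   # revoking it blocks the company from one service
    customer: str          # revoking it blocks the whole enterprise customer


def is_allowed(tokens: RequestTokens, revoked: frozenset) -> bool:
    """A request proceeds only if none of its three tokens has been revoked."""
    needed = {tokens.employee, tokens.company_service, tokens.customer}
    return revoked.isdisjoint(needed)


# Two employees of the same customer using the same service.
alice = RequestTokens("emp-alice", "acme-salesforce", "cust-acme")
bob = RequestTokens("emp-bob", "acme-salesforce", "cust-acme")

print(is_allowed(alice, frozenset()))                     # nothing revoked
print(is_allowed(alice, frozenset({"emp-alice"})))        # Alice blocked everywhere
print(is_allowed(bob, frozenset({"emp-alice"})))          # Bob unaffected
print(is_allowed(bob, frozenset({"acme-salesforce"})))    # company-wide service block
```

Revoking `"cust-acme"` in the same way would block both employees from every service, which corresponds to HPE cutting off an enterprise customer.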
This design was under review when the SB part of ESO was cancelled and all the people working on it, including me, were laid off. Hence, there’s no guarantee that there weren’t security issues with the design, nor is there a way to be sure of its scalability.