I really like this proposal. Just to be sure: It would still be possible to have an […] One use-case would be in the […]
Enable authenticated and authorized app-to-app communication via GoRouter using mutual TLS (mTLS). Applications connect to a shared internal domain (apps.mtls.internal), where GoRouter validates client certificates and enforces per-route access control using a default-deny model.

Key features:
- Phase 1a: Domain-specific mTLS in GoRouter (validates instance identity)
- Phase 1b: Authorization enforcement via allowed_sources route option
- Phase 2 (optional): Egress HTTP proxy for simplified client adoption

Depends on RFC-0027 (Generic Per-Route Features) for route options support.
@silvestre I have updated the RFC with:
This idea is really interesting, but will it be possible to communicate with app containers on different ports or over protocols other than HTTP?
Give credit to Beyhan and Max for the initial work on this RFC
This RFC currently focuses on HTTP traffic via GoRouter, but non-HTTP protocol support is an interesting future direction. Current constraints:
What would be needed for non-HTTP support:
For now, this is out of scope to keep the RFC focused and achievable. But feel free to create a follow-up RFC for non-HTTP use cases.
- `any: true` is mutually exclusive with `apps`, `spaces`, and `orgs`
- If `any` is not set, at least one of `apps`, `spaces`, or `orgs` must be specified (default-deny)

This builds on the route options framework from [RFC-0027: Generic Per-Route Features](rfc-0027-generic-per-route-features.md). Phase 1b depends on RFC-0027 being implemented first.
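The `any` / `apps` / `spaces` / `orgs` rules in the excerpt above could be checked roughly as follows. This is only an illustrative sketch, not the Cloud Controller implementation: the function name `validate_allowed_sources` and the dict-based input shape are assumptions made for the example.

```python
from __future__ import annotations


def validate_allowed_sources(sources: dict) -> list[str]:
    """Return validation errors for an allowed_sources value (empty list if valid).

    Hypothetical helper: the dict shape mirrors the keys named in the RFC excerpt.
    """
    errors = []
    # Selectors that are present and non-empty.
    selectors = [key for key in ("apps", "spaces", "orgs") if sources.get(key)]

    if sources.get("any") is True:
        # Rule 1: `any: true` is mutually exclusive with apps, spaces, and orgs.
        if selectors:
            errors.append(
                "any: true is mutually exclusive with " + ", ".join(selectors)
            )
    elif not selectors:
        # Rule 2: without `any`, at least one selector is required (default-deny).
        errors.append("at least one of apps, spaces, or orgs must be specified")

    return errors
```

An empty result means the value satisfies both rules; anything else would be reported back to the user.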
We need to be careful with what we add to the per-route features. Each and every option we set in there will be transmitted via the NATS bus every 20s (unless adjusted by the operator) from each app instance to each gorouter instance. Even a slight increase in message size can have quite noticeable effects on the overall bandwidth consumption.
One thought we had is to allow only very simple rules: allow traffic from instances of the same app, from all apps within a space, or from all apps within an org. This could be a simple enum option, allowed_scope: app / space / org (the name is just for illustration purposes), which would keep the size within a predictable bound.
If we go for a more flexible option we must introduce strict limits on the number of bytes each user is allowed to add to each route.
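To make the bandwidth concern concrete, here is a rough back-of-the-envelope estimate. It is a sketch under simplifying assumptions: every app instance's route message reaches every gorouter once per emit interval, and NATS protocol overhead is ignored. The function name and the example numbers are hypothetical.

```python
def extra_nats_bytes_per_second(
    app_instances: int,
    gorouters: int,
    extra_bytes_per_message: int,
    emit_interval_s: float = 20.0,
) -> float:
    """Approximate additional bytes/second delivered to gorouters.

    Assumes one route message per app instance per interval, fanned out
    to every gorouter. Real deployments will differ.
    """
    if emit_interval_s <= 0:
        raise ValueError("emit_interval_s must be positive")
    return app_instances * gorouters * extra_bytes_per_message / emit_interval_s


# Hypothetical deployment: 10,000 instances, 10 gorouters, 200 extra bytes per message.
estimate = extra_nats_bytes_per_second(10_000, 10, 200)
print(f"{estimate / 1_000_000:.1f} MB/s of additional traffic")
```

Under these assumed numbers, 200 extra bytes per message add about 1 MB/s across the platform, which is why a bound on option size matters.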
Thanks for raising this concern, NATS bandwidth is an important consideration.
However, I'd argue this is a generic problem that should be addressed in RFC-0027 (Generic Per-Route Features), not by limiting the functionality of individual features built on top of it. RFC-0027 already acknowledges this at lines 26-27:
"Other components MAY limit the size of the map or size of keys / values for technical reasons."
The proposed allowed_scope: app | space | org enum would fundamentally change the security model:
- It only supports "relative" policies (same app/space/org as the target app)
- It doesn't support cross-boundary access (app in org A calling app in org B)
- It doesn't support specific allowlisting ("only apps X, Y, Z can call me")
These cross-boundary and specific-app use cases are core to what makes this feature valuable. For example, a shared service in one org needs to accept calls from apps in multiple other orgs - the enum approach can't express this.
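The difference can be illustrated with a small sketch of how a default-deny allowlist check might look. This is not GoRouter code: the `Caller` and `AllowedSources` types and the `is_allowed` function are hypothetical, and the caller's app, space, and org GUIDs are assumed to have been extracted from the validated client certificate beforehand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Caller:
    """Identity of the calling app instance (assumed to come from its client cert)."""
    app_guid: str
    space_guid: str
    org_guid: str


@dataclass(frozen=True)
class AllowedSources:
    """Hypothetical in-memory form of the allowed_sources route option."""
    apps: frozenset = frozenset()
    spaces: frozenset = frozenset()
    orgs: frozenset = frozenset()
    allow_any: bool = False


def is_allowed(caller: Caller, policy: Optional[AllowedSources]) -> bool:
    """Default-deny: a route without a policy rejects every caller."""
    if policy is None:
        return False
    if policy.allow_any:
        return True
    return (
        caller.app_guid in policy.apps
        or caller.space_guid in policy.spaces
        or caller.org_guid in policy.orgs
    )


# A shared service accepting callers from two other orgs (cross-boundary access).
shared_service_policy = AllowedSources(orgs=frozenset({"org-a", "org-b"}))
print(is_allowed(Caller("app-1", "space-1", "org-a"), shared_service_policy))
print(is_allowed(Caller("app-9", "space-9", "org-z"), shared_service_policy))
```

An enum relative to the target app has no way to name `org-a` and `org-b` explicitly, which is the limitation described above.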
Regarding limits: I agree that some limit on route options size makes sense, but it should be a single global limit on the total route options size, configured by the operator. This keeps the concern in RFC-0027 where it belongs, rather than requiring each individual route option to implement its own size restrictions.
Operators already tune bandwidth-related settings like route_emitting_interval based on their deployment characteristics. A configurable max size for route options would follow the same pattern - operators can balance flexibility vs bandwidth based on their specific needs.
Proposed path forward:
- Add a note to RFC-0027 specifying an operator-configurable global size limit for route options
- Keep the flexible `allowed_sources` design in this RFC
@maxmoehl shall I take a stab at creating a PR for the global route options size limit, or do you have fundamental concerns with this approach?
Sounds good to me! The adjustment to RFC-0027 should target CC so that the user gets immediate feedback, instead of some internal routing component silently failing.
First, I really like the idea behind this RFC. I have a unique constraint where I need fine-grained access control at the org level over whether app-to-app mTLS communication is allowed. For instance, at the platform layer, I need to enforce that app-to-app mTLS between organizations is not allowed, while within a space it would be, meaning you would need to be a Space Developer in both spaces.
Implementation Update

Draft PRs implementing Phase 1 (1a + 1b):
Just a note about the PRs: I have not yet reviewed them myself, I just wanted to get something functional. Tested end-to-end on BOSH-Lite.

Finding: Route Options Format

RFC-0027 doesn't allow nested objects/arrays in route options. We adapted to a flat format:

// Instead of nested mtls_allowed_sources: {apps: [...]}
{"mtls_allowed_apps": "guid1,guid2", "mtls_allowed_spaces": "space-guid", "mtls_allow_any": true}

Should the RFC be updated to reflect this, or should RFC-0027 be extended?

Open Issue: Application Security Groups

Apps need to reach GoRouter on port 443, but default ASGs block internal IPs. Currently this requires manual security group creation with router IPs.

Proposal: Auto-manage the ASG via BOSH link when the feature flag is enabled. This is not blocking (a manual workaround exists) but improves operator experience.
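For readers following the route options finding, here is a minimal sketch of how a consumer might turn the flat options back into a structured form. It uses only the three keys shown in the example in this comment; the function name `parse_flat_mtls_options` and the output shape are assumptions, and no key for orgs is included because none appears in the example.

```python
def parse_flat_mtls_options(options: dict) -> dict:
    """Convert flat, comma-separated route options into lists.

    Hypothetical helper; key names are taken from the example in this comment.
    """
    def split_guids(value) -> list:
        # Tolerate missing keys, stray whitespace, and empty segments.
        return [guid.strip() for guid in (value or "").split(",") if guid.strip()]

    return {
        "apps": split_guids(options.get("mtls_allowed_apps")),
        "spaces": split_guids(options.get("mtls_allowed_spaces")),
        "any": options.get("mtls_allow_any") is True,
    }


flat = {
    "mtls_allowed_apps": "guid1,guid2",
    "mtls_allowed_spaces": "space-guid",
    "mtls_allow_any": True,
}
print(parse_flat_mtls_options(flat))
```

Comma-separated strings keep the options within what RFC-0027 permits, at the cost of pushing parsing and validation onto every consumer.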
Also recorded a demo here: https://asciinema.org/a/zLXrO9ERP3lXqGuM, but this was before refactoring to flat options (still uses the nested structure, which is why cf curl is being used).
Add a new Size Limits section specifying that Cloud Controller must enforce a configurable maximum size (default: 1 KB) for route options to prevent excessive NATS bandwidth consumption.

- Default limit: 1024 bytes
- Configurable via cc.max_route_options_size BOSH property
- CC returns HTTP 422 when limit is exceeded
- Documents relationship with route emit interval for tuning

This addresses feedback from the App-to-App mTLS RFC (PR #1438) where concerns were raised about NATS bandwidth impact of per-route options.
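The size check described in this commit could look roughly like the following. This is a sketch, not the Cloud Controller code: the function name is hypothetical, and measuring the size as the UTF-8 length of the compact JSON serialization is an assumption, since the commit message does not say how size is measured.

```python
import json
from typing import Optional

DEFAULT_MAX_ROUTE_OPTIONS_SIZE = 1024  # bytes, the default stated in the commit message


def check_route_options_size(
    options: dict, max_size: int = DEFAULT_MAX_ROUTE_OPTIONS_SIZE
) -> Optional[dict]:
    """Return None if the options fit, else an error describing an HTTP 422.

    Assumption: size is the UTF-8 byte length of the compact JSON encoding.
    """
    size = len(json.dumps(options, separators=(",", ":")).encode("utf-8"))
    if size > max_size:
        return {
            "status": 422,
            "detail": f"route options are {size} bytes, limit is {max_size} bytes",
        }
    return None


print(check_route_options_size({"mtls_allowed_apps": "guid1,guid2"}))
print(check_route_options_size({"mtls_allowed_apps": "x" * 2000}))
```

Rejecting the request at the API layer gives the user immediate feedback, as requested in the review discussion.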
Summary
This RFC proposes enabling authenticated and authorized app-to-app communication via GoRouter using mutual TLS (mTLS).
View the full RFC
Applications connect to a shared internal domain (`apps.mtls.internal`), where GoRouter:
- […] `allowed_sources`

Key Points
- […] `allowed_sources`
- `apps.mtls.internal` is a separate domain tree, avoiding conflicts with existing `apps.internal` routes
- […] `allowed_sources` support

Implementation Phases
- […] `allowed_sources` (co-requisite with 1a)

cc @cloudfoundry/toc @cloudfoundry/wg-app-runtime-interfaces