Skip to content

feat: Prowler Analyst #7508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 37 commits into
base: master
Choose a base branch
from
Open

Conversation

Chan9390
Copy link

Context

Feature adding Prowler Analyst.

Description

Please include a summary of the change and which issue is fixed. List any dependencies that are required for this change.

Checklist

API

  • Verify if API specs need to be regenerated.
  • Check if version updates are required (e.g., specs, Poetry, etc.).
  • Ensure new entries are added to CHANGELOG.md, if applicable.

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Sorry, something went wrong.

@Chan9390 Chan9390 requested review from a team as code owners April 14, 2025 15:14
@github-actions github-actions bot added component/ui component/api review-django-migrations This PR contains changes in Django migrations labels Apr 14, 2025
Copy link

@gmaster1405 gmaster1405 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apply

@prowler-cloud prowler-cloud deleted a comment from gmaster1405 Apr 21, 2025
Copy link
Member

@paabloLC paabloLC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitting the comments I have so far. I’ll continue in a bit

// Get Users Schema

const userFieldsEnum = z.enum([
"",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this empty string in all the types? Just curious, could you tell me why? Thanks

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, while experimenting with OpenAI models gpt-4o and gpt-4o-mini, if we add a type without "" as an additional element in it, the LLM model picked a random element from the type even if it wasn't required to pick (ex: filters, sort, etc). Hence added an empty string to every type where the LLM selects the option.

@Chan9390 Chan9390 requested review from a team as code owners April 21, 2025 08:41
@github-actions github-actions bot added documentation provider/aws Issues/PRs related with the AWS provider provider/gcp Issues/PRs related with the Google Cloud Platform provider github_actions Pull requests that update GitHub Actions code output/html Issues/PRs related with the HTML output format compliance Issues/PRs related with the Compliance Frameworks labels Apr 21, 2025
Comment on lines 2265 to 2277
def to_representation(self, instance):
ret = super().to_representation(instance)
# Add back the masked API key for display
ret["api_key"] = "*" * len(instance.api_key) if instance.api_key else None
return ret
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can omit the key length, the api keys can change so I think it is better to fix a value like for example 8

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, we usually avoid credentials on read serializers. If the value is going to be ***, it does not provide any useful information anyway.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main goal of this is to display the API key as ****** in Lighthouse config page.

image

@vicferpoy Because all AI interactions happen from UI, the UI needs to access OpenAI/other LLM keys (so I guess this credential should be on read serializer).

For now, implemented this:

  1. /api/v1/lighthouse-config - All configured keys (current logic only checks for OpenAI keys but later can support other LLMs)
  2. /api/v1/lighthouse-config/<UUID> - Gives the selected model and business context along with ******** just to show in UI that the AI key is configured
  3. /api/v1/lighthouse-config/<UUID>/show_key - Gives the actual key

Please let me know if there's a better way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use a query parameter then to either show the key or not. Another endpoint is not necessary.

Comment on lines 2633 to 2637
if LighthouseConfig.objects.filter(tenant_id=request.tenant_id).exists():
return Response(
{"detail": "AI configuration already exists. Use PUT to update."},
status=status.HTTP_400_BAD_REQUEST,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the serializers validate method like in provider-groups and this way specify which attribute produce the error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 5143 to 5158
response = authenticated_client.post(
"/api/v1/lighthouse-config/",
data=valid_config_payload,
content_type=API_JSON_CONTENT_TYPE,
)
assert response.status_code == status.HTTP_201_CREATED
config_id = response.json()["data"]["id"]

# Retrieve config
response = authenticated_client.get(f"/api/v1/lighthouse-config/{config_id}/")
assert response.status_code == status.HTTP_200_OK
data = response.json()["data"]
assert data["attributes"]["name"] == "Test Config"
assert data["attributes"]["api_key"] == "*" * len(
valid_config_payload["data"]["attributes"]["api_key"]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a unit test if you depend on a POST for a retrieval. Use pytest fixtures instead so you already have items to work with in the database when the test begins.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apply this principle on every other test, please.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the tests, please check. Thanks!

Comment on lines 2678 to 2681
return Response(
{"api_key": decrypted_key},
status=status.HTTP_200_OK,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not follow the JSON:API specification.

@@ -2565,3 +2572,126 @@ def get_serializer_context(self):
context = super().get_serializer_context()
context["allowed_providers"] = self.allowed_providers
return context


class LighthouseConfigViewSet(BaseRLSViewSet):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CRUD endpoints could be omitted to avoid undesired behavior if they do not need to perform any extra action. If they just manage the resource in the database, it's not necessary to define the functions and the view is left clearer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed update and destroy as they are not really doing anything. For create we need the custom LighthouseConfigSerializer serializer to have a masked key in output. Update also used the serializer but the response doesn't contain the API key field and that's okay.

Comment on lines +2613 to +2624
try:
client = openai.OpenAI(
api_key=instance.api_key_decoded,
)
models = client.models.list()
return Response(
{
"detail": "Connection successful!",
"available_models": [model.id for model in models.data],
},
status=status.HTTP_200_OK,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this connection check last more than 1 second? If so, we should delegate this to a task like the provider connection check. It would also follow the JSON:API specification.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this API method is not integrated in UI. @paabloLC is connection check needed for OpenAI or LLM keys thats configured in Prowler?

Comment on lines 2265 to 2277
def to_representation(self, instance):
ret = super().to_representation(instance)
# Add back the masked API key for display
ret["api_key"] = "*" * len(instance.api_key) if instance.api_key else None
return ret
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use a query parameter then to either show the key or not. Another endpoint is not necessary.

Comment on lines 2297 to 2305
def validate(self, attrs):
tenant_id = self.context.get("request").tenant_id
if LighthouseConfig.objects.filter(tenant_id=tenant_id).exists():
raise serializers.ValidationError(
{
"tenant_id": "AI configuration already exists for this tenant. Use PUT to update."
}
)
return super().validate(attrs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are still missing what I suggested in #7508 (comment).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not changing this as discussed on Slack thread - https://prowler-workspace.slack.com/archives/C08BGLVDJ4D/p1745935957195789

Copy link
Member

@paabloLC paabloLC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added more comments for now, thank you @Chan9390

@@ -0,0 +1,87 @@
import { apiBaseUrl, getAuthHeaders, parseStringify } from "@/lib/helper";

export const aiGetCompliancesOverview = async ({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path /prowler/ui/actions/lighthouse/compliances.ts looks good enough because there is only 1 file related to compliance.

Aside from that, I would suggest starting the function name with get, something like getLighthouseComplianceOverview , to keep it consistent with our naming conventions. What do you think about that?

By doing that, we're also removing the ai from the name.

Copy link
Member

@paabloLC paabloLC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a few more comments. Thank you, @Chan9390

fields?: string[];
filters?: Record<string, string | number | boolean | undefined>;
page?: number;
page_size?: number;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This page_size should be pageSize for consistency. Thank you

];

return (
<div className="flex h-[calc(100vh-theme(spacing.16))] min-w-0 flex-col bg-background">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the <ContentLayout> component? I don't think you need a new layout for this. Take a look at any other page.

<ContentLayout title="Configure Chatbot" icon="lucide:settings">

If so, please, remove the app/(prowler)/lighthouse/config/layout.tsx

children: React.ReactNode;
}

export default function ChatbotConfigLayout({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check my previous comment; we probably don't need this.

@@ -0,0 +1,274 @@
"use client";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't add use client to any page in the app. We're using SSR capabilities from Next.js 14. There are no pages with use client in the entire project, so let's avoid introducing it.

For references: https://nextjs.org/docs/14/app/building-your-application/rendering

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I dont know how to fix this. If I remove use client, the configured API keys and business context are not shown. Please let me know how to fix this.

@@ -0,0 +1,136 @@
"use client";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't add use client to any page in the app. We're using SSR capabilities from Next.js 14. There are no pages with use client in the entire project, so let's avoid introducing it.

For references: https://nextjs.org/docs/14/app/building-your-application/rendering

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to not use client, but somehow because of AI SDK, use client is the only solution I found.

If I removed "use client" in chat.tsx, the page errors out with the following:

 ⨯ ./node_modules/@ai-sdk/react/dist/index.mjs
Attempted import error: 'swr' does not contain a default export (imported as 'useSWR').

Import trace for requested module:
./node_modules/@ai-sdk/react/dist/index.mjs
./app/(prowler)/lighthouse/page.tsx

I don't know how to solve this. Please let me know if there's a way. Thanks!


export const getFindingsTool = tool(
async ({ page, pageSize, query, sort, filters }) => {
return await getFindings({ page, pageSize, query, sort, filters });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you have pageSize correctly


// Get Findings Schema

// const findingFieldsEnum = z.enum([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this commented code?


// Get Findings By Status

// const findingsOverviewFieldsEnum = z.enum([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this commented code?


// Get Providers Schema

// const providerFieldsEnum = z.enum([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this commened code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compliance Issues/PRs related with the Compliance Frameworks component/api component/ui documentation github_actions Pull requests that update GitHub Actions code output/html Issues/PRs related with the HTML output format provider/aws Issues/PRs related with the AWS provider provider/gcp Issues/PRs related with the Google Cloud Platform provider review-django-migrations This PR contains changes in Django migrations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants