Skip to content

Conversation

@pitangainnovare
Copy link
Contributor

O que esse PR faz?

Criação do app xml_manager, responsável por gerenciar o envio e processamento de arquivos XML. Ele recebe um arquivo, valida sua estrutura e conteúdo, e dispara processos de conversão (PDF).

Onde a revisão poderia começar?

No arquivo xml_manager/views.py, que contém a lógica principal de recebimento e disparo do processamento.

Como este poderia ser testado manualmente?

  1. Subir o projeto.
  2. Acessar a URL de envio de XML.
  3. Enviar um XML válido.
  4. Conferir a listagem no template e verificar se as tasks de validação e conversão foram disparadas.

Algum cenário de contexto que queira dar?

Essa é a primeira versão do módulo, servindo como base para evoluções futuras. Algumas partes ainda estão pendentes (modelo e task de PDF).

Screenshots

N/A.

Quais são tickets relevantes?

Sem vínculo com issues até o momento.

Referências

  • Código utilitário reaproveitado de outro repositório.

This comment was marked as outdated.

Remove anotação que desabilitava proteção CSRF

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@@ -0,0 +1,94 @@
import logging
Copy link
Member

@robertatakenaka robertatakenaka Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitangainnovare a pasta já se chama utils, obtenha o requester.py do scms-upload para as funções de requests
https://github.com/scieloorg/scms-upload/tree/main/core/utils

null=True,
verbose_name=_("Exceptions File")
)
pdf_file = models.FileField(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Crie um modelo, em que tenha idioma. Além disso, cada XML tem que ter 0 ou mais PDF

null=True,
verbose_name=_("PDF File")
)
html_file = models.FileField(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Crie um modelo, em que tenha idioma. Além disso, cada XML tem que ter 0 ou mais HTML


@celery_app.task(bind=True, name=_('Process XML Document'), timelimit=-1)
def task_process_xml_document(self, xml_id, user_id=None, username=None):
user = _get_user(self.request, username=username, user_id=user_id)
Copy link
Member

@robertatakenaka robertatakenaka Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitangainnovare Esta linha pode ser após o bloco try/except. E como está definindo o user aqui, então ...

return False

logging.info(f'Processing XML file {xml_document.xml_file.name}.')
task_validate_xml_file.delay(xml_id, user_id=user_id, username=username)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitangainnovare então aqui use user.id e não precisará de username_username

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nos outros projetos (usage, upload, core) adotamos essa abordagem de informar ou user_id ou username. Isso possibilita chamar as coisas via periodic_task na interface passando username e via outras formas usando user_id.

]


class XMLDocumentSnippetViewSet(SnippetViewSet):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitangainnovare coloque View em views.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isso é uma classe do Wagtail - é o SnippetViewSet (aquele que substitui "WagtailAdminView" quando subimos o wagtail para versões mais novas).



@hooks.register('register_admin_urls')
def register_admin_urls():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitangainnovare isso deveria ficar em urls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As funções com @hooks.register precisam ficar em wagtail_hooks.py, porque o Wagtail carrega esse arquivo automaticamente quando inicia.

"hooks" são uma forma de extender o wagtail.

Copy link
Member

@robertatakenaka robertatakenaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitangainnovare a mudança realmente necessária é a modelagem que permita associar 1 XML a vários PDF e HTML

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new Django app called xml_manager for handling XML file processing and conversion. The app provides functionality to upload, validate, and convert XML files to PDF and HTML formats using asynchronous task processing.

  • Creation of complete XML manager app with models for XML documents and their generated outputs
  • Implementation of async task processing for validation and file conversion workflows
  • Integration with Wagtail admin interface for file management and processing controls

Reviewed Changes

Copilot reviewed 28 out of 32 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
xml_manager/models.py Defines core models for XML documents and their PDF/HTML derivatives
xml_manager/tasks.py Implements Celery tasks for XML validation and conversion processes
xml_manager/utils.py Contains utility functions for XML validation and PDF/HTML generation
xml_manager/wagtail_hooks.py Configures Wagtail admin interface and custom table columns
xml_manager/views.py Provides view for triggering XML processing tasks
tracker/models.py Adds XMLDocumentEvent model and refactors existing event tracking
requirements/base.txt Adds packtools and langdetect dependencies
compose/*/django/Dockerfile Installs LibreOffice and fonts for document conversion

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

message=str(e),
save=True,
)
return False
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The task_generate_html_file function is missing a return statement for the successful execution path. After line 192 (html_instance.save()), the function should return True to indicate successful completion.

Copilot uses AI. Check for mistakes.
os.makedirs(output_root_dir)

# ToDo: Implement HTML generation logic here
return No newline at end of file
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generate_html_for_xml_document function returns None but the calling code in tasks.py expects it to return path_html and lang values. This will cause an unpacking error when the task tries to assign the return values.

Suggested change
return
return None, None

Copilot uses AI. Check for mistakes.
message=str(e),
save=True,
)
except exceptions.XML_File_HTML_Generation_Error as e:
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception class XML_File_HTML_Generation_Error is referenced but not defined in the xml_manager.exceptions module. This will cause a NameError when this exception path is executed.

Copilot uses AI. Check for mistakes.

user = _get_user(self.request, username=username, user_id=user_id)

params = {'libreoffice_binary': libreoffice_binary,}
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] There's an unnecessary trailing comma after the dictionary value. Consider removing it for cleaner code formatting.

Suggested change
params = {'libreoffice_binary': libreoffice_binary,}
params = {'libreoffice_binary': libreoffice_binary}

Copilot uses AI. Check for mistakes.

list_display = (
"xml_document",
LinkColumn("pdf_file", "PDF file"),
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LinkColumn constructor is missing the 'label' parameter name. It should be LinkColumn("pdf_file", label="PDF file") to match the pattern used in other list_display entries.

Suggested change
LinkColumn("pdf_file", "PDF file"),
LinkColumn("pdf_file", label="PDF file"),

Copilot uses AI. Check for mistakes.
list_display = (
"xml_document",
LinkColumn("pdf_file", "PDF file"),
LinkColumn("docx_file", "DOCX file"),
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LinkColumn constructor is missing the 'label' parameter name. It should be LinkColumn("docx_file", label="DOCX file") to match the pattern used in other list_display entries.

Suggested change
LinkColumn("docx_file", "DOCX file"),
LinkColumn("docx_file", label="DOCX file"),

Copilot uses AI. Check for mistakes.

list_display = (
"xml_document",
LinkColumn("html_file", "HTML file"),
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LinkColumn constructor is missing the 'label' parameter name. It should be LinkColumn("html_file", label="HTML file") to match the pattern used in other list_display entries.

Suggested change
LinkColumn("html_file", "HTML file"),
LinkColumn("html_file", label="HTML file"),

Copilot uses AI. Check for mistakes.
if not os.path.exists(output_root_dir):
os.makedirs(output_root_dir)

# ToDo: Implement HTML generation logic here
Copy link

Copilot AI Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Use standard spelling 'TODO' instead of 'ToDo' for consistency with common development conventions.

Suggested change
# ToDo: Implement HTML generation logic here
# TODO: Implement HTML generation logic here

Copilot uses AI. Check for mistakes.
@robertatakenaka robertatakenaka merged commit b65657a into scieloorg:main Sep 1, 2025
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants