Datasets:

DilatoMX
/

G11n_GenAI_Prompt_Datasets

Tags:

Dataset Viewer

Split (1)

train

The dataset viewer is not available for this split.

Cannot extract the features (columns) for the split 'train' of the config 'default' of the dataset.

Error code:   FeaturesError
Exception:    UnicodeDecodeError
Message:      'utf-8' codec can't decode byte 0xe9 in position 261: invalid continuation byte
Traceback:    Traceback (most recent call last):
                File "/src/services/worker/src/worker/job_runners/split/first_rows.py", line 246, in compute_first_rows_from_streaming_response
                  iterable_dataset = iterable_dataset._resolve_features()
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/iterable_dataset.py", line 4196, in _resolve_features
                  features = _infer_features_from_batch(self.with_format(None)._head())
                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/iterable_dataset.py", line 2533, in _head
                  return next(iter(self.iter(batch_size=n)))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/iterable_dataset.py", line 2711, in iter
                  for key, pa_table in ex_iterable.iter_arrow():
                                       ^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/iterable_dataset.py", line 2249, in _iter_arrow
                  yield from self.ex_iterable._iter_arrow()
                File "/usr/local/lib/python3.12/site-packages/datasets/iterable_dataset.py", line 494, in _iter_arrow
                  for key, pa_table in iterator:
                                       ^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/iterable_dataset.py", line 384, in _iter_arrow
                  for key, pa_table in self.generate_tables_fn(**gen_kwags):
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/packaged_modules/csv/csv.py", line 196, in _generate_tables
                  csv_file_reader = pd.read_csv(file, iterator=True, dtype=dtype, **self.config.pd_read_csv_kwargs)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/streaming.py", line 73, in wrapper
                  return function(*args, download_config=download_config, **kwargs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/utils/file_utils.py", line 1250, in xpandas_read_csv
                  return pd.read_csv(xopen(filepath_or_buffer, "rb", download_config=download_config), **kwargs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
                  return _read(filepath_or_buffer, kwds)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 620, in _read
                  parser = TextFileReader(filepath_or_buffer, **kwds)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
                  self._engine = self._make_engine(f, self.engine)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
                  return mapping[engine](f, **self.options)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
                  self._reader = parsers.TextReader(src, **kwds)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "pandas/_libs/parsers.pyx", line 574, in pandas._libs.parsers.TextReader.__cinit__
                File "pandas/_libs/parsers.pyx", line 663, in pandas._libs.parsers.TextReader._get_header
                File "pandas/_libs/parsers.pyx", line 874, in pandas._libs.parsers.TextReader._tokenize_rows
                File "pandas/_libs/parsers.pyx", line 891, in pandas._libs.parsers.TextReader._check_tokenize_status
                File "pandas/_libs/parsers.pyx", line 2053, in pandas._libs.parsers.raise_parser_error
                File "<frozen codecs>", line 322, in decode
              UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 261: invalid continuation byte

Need help to make the dataset viewer work? Make sure to review how to configure the dataset viewer, and open a discussion for direct support.

🌍 GenAI G11n Prompt Evaluation Dataset

This dataset was developed to support manual evaluation of multilingual generative AI models based on coherence, translation accuracy, cultural adaptation, and multimodal consistency.

📦 Dataset Contents

The dataset includes:

Prompts Type: A list of types of prompts for testing linguistic behavior in different contexts.
Placeholders: A list of the used placeholders to localize the prompts usually from the locale's word dataset.
Locale Folder: All data related specifically to one locale.
- Assessment Categories:
  - Language & Grammar
  - Cultural Adaptation
  - Instruction & Response
  - Multimodal Consistency
- Word Dataset: Collection of culturally relevant words for that specific locale.
- Formatting: Guidelines and other aspects to evaluate beyond translation, specifics for each locale.

🧪 Purpose

This dataset enables the manual assessment of AI models on tasks such as:

Following multilingual instructions
Handling ambiguity in translation
Adapting responses to regional or cultural nuances
Generating appropriate outputs in audio, image, and code formats

✅ Evaluation Criteria

The prompts are designed to evaluate the following success criteria:

Instruction & Response Coherence
Translation Accuracy
Cultural Adaptation
Multimodal Consistency

👥 Evaluation Team

Andres Castillo – G11n QA
Edgar Castillo – G11n QA
Leslie Valles - G11n QA & Linguistic Advisor
Patricia Oceguera – Linguistic Advisor
Marcela Salgado – Review Support

📊 Evaluated AI Model Versions

V1.0.2

ChatGPT – GPT-4o
Gemini – 2.0 Flash
Copilot – GPT-4o
DeepSeek – V3

V1.2

ChatGPT – GPT-5.4
Gemini – 3.0 Flash
Copilot – GPT-5.5
DeepSeek – V3

📄 License

This dataset is released under the CC BY 4.0 License, allowing sharing, modification, and redistribution with proper attribution.

📌 Additional Notes

For detailed instructions on how to run the evaluation using this dataset, go to the model page:
G11n_GenAI_Assessment_Model under the organization Dilato Infotech Limited here on Hugging Face.

Downloads last month: 93

Total file size:

364 kB