🌍 GenAI G11n Prompt Evaluation Dataset

This dataset was developed to support manual evaluation of multilingual generative AI models based on coherence, translation accuracy, cultural adaptation, and multimodal consistency.

📦 Dataset Contents

The dataset includes:

  • Prompt Types: A list of the prompt types used to test linguistic behavior in different contexts.
  • Placeholders: A list of the placeholders used to localize the prompts, usually filled from the locale's word dataset.
  • Locale Folder: All data related specifically to one locale.
    • Assessment Categories:
      • Language & Grammar
      • Cultural Adaptation
      • Instruction & Response
      • Multimodal Consistency
    • Word Dataset: Collection of culturally relevant words for that specific locale.
    • Formatting: Guidelines and other locale-specific aspects to evaluate beyond translation.
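
The placeholder mechanism described above can be sketched as follows. This is a minimal illustration, not the dataset's actual tooling: the `{name}` placeholder syntax, the `localize_prompt` helper, and the sample template and words are all assumptions made for the example.

```python
import re

# Hypothetical placeholder syntax: {name}. The dataset's real markers may differ.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def localize_prompt(template: str, words: dict[str, str]) -> str:
    """Replace each {name} marker with the locale's word for it.

    Unknown placeholders are left untouched so a reviewer can spot them.
    """
    def substitute(match: re.Match) -> str:
        return words.get(match.group(1), match.group(0))

    return PLACEHOLDER.sub(substitute, template)


# Illustrative values only; they are not taken from the dataset.
template = "Suggest a recipe for {dish} to serve during {holiday}."
es_mx_words = {"dish": "tamales", "holiday": "Día de Muertos"}

print(localize_prompt(template, es_mx_words))
```

Leaving unmatched placeholders in place, rather than raising an error, is a design choice that suits manual review: an unfilled marker stays visible in the localized prompt.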

🧪 Purpose

This dataset enables the manual assessment of AI models on tasks such as:

  • Following multilingual instructions
  • Handling ambiguity in translation
  • Adapting responses to regional or cultural nuances
  • Generating appropriate outputs in audio, image, and code formats

✅ Evaluation Criteria

The prompts are designed to evaluate the following success criteria:

  • Instruction & Response Coherence
  • Translation Accuracy
  • Cultural Adaptation
  • Multimodal Consistency

👥 Evaluation Team

  • Andres Castillo – G11n QA
  • Edgar Castillo – G11n QA
  • Leslie Valles – G11n QA & Linguistic Advisor
  • Patricia Oceguera – Linguistic Advisor
  • Marcela Salgado – Review Support

📊 Evaluated AI Model Versions

V1.0.2

  • ChatGPT – GPT-4o
  • Gemini – 2.0 Flash
  • Copilot – GPT-4o
  • DeepSeek – V3

V1.2

  • ChatGPT – GPT-5.4
  • Gemini – 3.0 Flash
  • Copilot – GPT-5.5
  • DeepSeek – V3

📄 License

This dataset is released under the CC BY 4.0 License, allowing sharing, modification, and redistribution with proper attribution.

📌 Additional Notes

For detailed instructions on running the evaluation with this dataset, see the model page G11n_GenAI_Assessment_Model, under the Dilato Infotech Limited organization on Hugging Face.
