Reviewing the Terms & Conditions of popular generative AI tools | by David Serrault | Feb, 2024

The integration of AI into design unlocks unprecedented opportunities. However, its commercial use raises several important questions and potential pitfalls.

Designers need to understand the ethical and legal issues raised by this disruptive technology.

Generative AI models are built through training on massive datasets. In the case of generative image AI, these consist of classified and labeled images that are supplied to algorithms so they can learn their characteristics and subsequently generate visuals in response to users’ requests. The data used for training is referred to as “training data.” The learning process employed by the computational model is called “deep learning.”

Image generation models do not store the actual images in their memory; instead, they retain only typical features, encoded as dimensions of a mathematical representation called “latent space.”

Image generation tools typically rely on an algorithm called “diffusion,” which starts from random noise and iteratively brings out the features of the image described in a prompt. This process can be compared to how a sculptor brings forth a shape from a raw block of stone.
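To make the "start from noise, refine step by step" idea concrete, here is a toy sketch. It is not a real diffusion model: a real one uses a trained neural network, guided by the prompt, to predict the noise to remove. This stand-in cheats by comparing against a known target, and the function name, step count, and 0.2 step size are all assumptions chosen for illustration.

```python
import random

def toy_denoise(target, steps=50, seed=0):
    """Start from random noise and strip away part of the remaining noise at each step."""
    rng = random.Random(seed)
    # Begin with pure Gaussian noise, one value per "pixel".
    image = [rng.gauss(0.0, 1.0) for _ in target]
    history = []
    for _ in range(steps):
        # Stand-in for the trained network: the "predicted noise" is simply
        # the gap between the current image and the known target.
        predicted_noise = [x - t for x, t in zip(image, target)]
        # Remove a fraction (20%) of the predicted noise.
        image = [x - 0.2 * n for x, n in zip(image, predicted_noise)]
        # Track the largest remaining deviation from the target.
        history.append(max(abs(x - t) for x, t in zip(image, target)))
    return image, history

# Three hypothetical "pixel" values standing in for the image a prompt describes.
image, history = toy_denoise([0.2, 0.8, 0.5])
```

Each pass leaves the result a little closer to the target, which is the sculptor analogy in miniature: the shape emerges gradually rather than in a single step.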

A year ago, Vox published a straightforward explanation of how AI art works. If you want a better understanding of “diffusion” algorithms, I recommend this video by Computerphile.

Tracing back the source of the images used to train models is often challenging. Were they part of the public domain? Had authors granted permission for such use? Moreover, some companies have implemented sophisticated strategies to collect them at low cost, further complicating traceability.

Screenshot of the Stability.ai website (stability.ai/stable-image)

Stability.ai serves as a prime example of problematic strategies related to AI image generation models. It is a private company offering paid services and APIs for generating images. Among its offerings are those based on Stable Diffusion, a widely known open-source image generation model developed partly by Stability.ai.

According to one estimate, as of 2022 the dataset used for initial training contained approximately 2.3 billion classified and tagged images.

How were these images collected?

Stable Diffusion is an open-source project made possible thanks to LAION, a nonprofit organization providing access to extensive databases of categorized and tagged images.

To build its collection, LAION relied on another nonprofit organization named Common Crawl, whose goal is to crawl and archive content from billions of web pages, including those hosted on platforms like Pinterest, WordPress blogs, Flickr, Tumblr, Getty Images, iStockPhoto…

Consequently, there’s a high probability that most images utilized during Stable Diffusion’s training phase fall under copyright protection. However, since Stability.ai uses open-source models supported by free data provisions from nonprofits, no payments were issued for these images.

This method of developing specific AI models faces criticism and has led to legal disputes between intellectual property holders and businesses supplying AI image generation services.

Therefore, exercising caution when using generative AI services becomes crucial. One should verify whether it’s feasible to determine the datasets used to train the underlying models and what guarantees the enterprise offers regarding rights management.

Most of the generative AI services offer very limited protections to their customers facing potential copyright infringement lawsuits. When they provide any…

Midjourney

By using Midjourney, you retain ownership of your creations subject to specific constraints:

  • If you work for or represent a business with annual revenue exceeding $1,000,000, you must sign up for a Corporate account.
  • Using Midjourney grants the platform the right to use everything created with it. Consequently, other users may gain indirect access to results generated via your prompts and could potentially incorporate them into additional outcomes.

Firefly (license excerpt, December 23)

For commercial use: “Features of Firefly that are no longer in beta testing can now be used for commercial purposes. Others still in beta testing remain intended for personal use only. Please ensure each feature status before using Firefly applications…” “Firefly is trained on hundreds of millions of high-resolution Adobe Stock images along with openly licensed and public-domain content designed for creating safely usable content for commercial purposes.”

Adobe positions its generative AI as safer for commercial purposes, yet it might deliver less performance on particular image categories due to more limited training corpora.

Potential pitfalls: bias and opacity

When designing AI-based tools and services, it’s essential to be aware of potential ethical risks that can arise due to their inherent nature. Here are some common issues found in AI applications and their possible origins.

Case 1: Facial recognition

Problem: Initial iterations have shown higher error rates for specific ethnicities compared to others.

Cause: The models were based on unbalanced, incomplete, or unrepresentative reference image datasets, leading to reduced accuracy when identifying individuals from underrepresented ethnic backgrounds within training data.
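One way to surface this kind of gap is to break error rates down by group instead of reporting a single overall figure. The sketch below assumes evaluation results are available as `(group, predicted, actual)` tuples; the function name, group labels, and records are all hypothetical and invented for illustration.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """Return {group: share of wrong predictions} for (group, predicted, actual) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Hypothetical evaluation results, invented for illustration only.
records = [
    ("group_a", "match", "match"),
    ("group_a", "match", "match"),
    ("group_a", "no_match", "no_match"),
    ("group_a", "match", "no_match"),
    ("group_b", "match", "no_match"),
    ("group_b", "no_match", "match"),
    ("group_b", "match", "match"),
    ("group_b", "no_match", "no_match"),
]
rates = error_rate_by_group(records)
```

With this made-up data, `group_b` has twice the error rate of `group_a` (0.5 versus 0.25), a difference that an overall error rate of 0.375 would hide.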

Case 2: AI in the hiring process

Problem: An AI system designed to select top candidates for technical roles at a major tech company favored male applicants over female applicants.

Cause: The firm trained its model solely on its own recruitment history. Because it had previously hired more men than women for such positions, the model learned and amplified the gender bias already present in that data.

Case 3: AI applied to legal systems

Problem: Researchers claimed an ability to determine criminal propensity through facial image analysis — raising serious concerns regarding privacy and discrimination.

Cause: Apart from the questionable societal relevance of the claim, the training dataset carried implicit biases. Photographs of convicted people taken in jail contrasted sharply with those of other subjects, introducing confounding factors such as setting, clothing, and expression that affected the final output.

Model cards: transparent communication for AI products

A growing number of companies building AI services have adopted “model cards,” a best practice aimed at promoting responsibility and transparency. These documents help anticipate potential bias, inequality, and defects before deployment by providing users with valuable information about how each model operates and what limitations to expect.

Model cards typically include six sections:

  1. Purpose of the model: An explanation of the intended function, problem-solving context, and targeted usage scenarios.
  2. Performance details: Quantitative and qualitative insights into model performance levels.
  3. Training data information: Details on data sources, size, and applied preprocessing steps.
  4. Known biases list: Identification of any known potential biases within training data sets and their impact on overall performance.
  5. Usage Limitations: Warnings concerning scenarios where the model may not perform accurately, as well as circumstances involving ethical or legal concerns related to use.
  6. Training methodology: Description of machine learning techniques employed while developing the model.
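The six sections above can be sketched as a small data structure. This is only an illustration, not an official model card schema: the class, the field names, and the example values are all assumptions.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ModelCard:
    # One field per section listed above (names are illustrative).
    purpose: str
    performance: str
    training_data: str
    known_biases: list = field(default_factory=list)
    usage_limitations: list = field(default_factory=list)
    training_methodology: str = ""

    def missing_sections(self):
        """List the sections left empty, so gaps are visible before publication."""
        return [name for name, value in asdict(self).items() if not value]

# Hypothetical card contents, invented for illustration only.
card = ModelCard(
    purpose="Generate product illustrations from text prompts.",
    performance="Evaluated by human raters on an internal prompt set.",
    training_data="Licensed stock images and public-domain content.",
    known_biases=["Under-represents some regional art styles."],
)
missing = card.missing_sections()
```

Here `missing` lists `usage_limitations` and `training_methodology`, the two sections this hypothetical card has not yet filled in.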

Examples of model cards from Google Cloud services.

The European AI Act

The EU AI Act establishes principles governing products and services using artificial intelligence. Its goal is to ensure AI systems are: safe, transparent, traceable, non-discriminatory, and environmentally friendly. It classifies AI systems into four risk categories, each with corresponding rules and constraints:

❌ Unacceptable Risk: AI systems that manipulate vulnerable persons, produce social scores, conduct real-time remote biometric identification in public spaces*, and more (*except for law enforcement purposes) are prohibited.

⚠️ High Risk: AI systems used in critical infrastructure, medical devices, employment, public services, law enforcement, border control, etc., are regulated and must meet these requirements: perform impact assessments on fundamental rights; allow citizens to file complaints and receive explanations about high-risk AI decisions affecting them; and be registered in a European database.

ℹ️ Limited Risk: AI systems like chatbots and emotion recognition technology must comply with minimum standards such as transparency, warning users of risks, and publishing lists of protected copyrighted materials used for training models.

✅ Low or Minimal Risk: No mandatory regulations apply, although businesses might adopt additional guidelines for AI recommendation systems and spam filters.
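For a team triaging its own products, the four tiers as summarized in this article can be sketched as a lookup table. This is an illustration only, not legal advice: the enum, the labels, and the mapping are assumptions built from the examples in the text, not terms from the regulation itself.

```python
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "prohibited"
    HIGH = "controlled, with mandatory requirements"
    LIMITED = "minimum transparency standards"
    MINIMAL = "no mandatory regulation"

# Examples taken from the summary in this article; the keys are
# illustrative labels, not legal terms.
EXAMPLE_TIERS = {
    "social scoring": RiskTier.UNACCEPTABLE,
    "medical device": RiskTier.HIGH,
    "employment screening": RiskTier.HIGH,
    "chatbot": RiskTier.LIMITED,
    "emotion recognition": RiskTier.LIMITED,
    "spam filter": RiskTier.MINIMAL,
    "recommendation system": RiskTier.MINIMAL,
}

def describe(use_case):
    """Return a one-line description of the tier for a known example use case."""
    tier = EXAMPLE_TIERS.get(use_case)
    if tier is None:
        # Anything outside the table needs a real legal assessment.
        return f"{use_case}: not in this example table, needs a proper legal assessment"
    return f"{use_case}: {tier.name.lower()} risk ({tier.value})"
```

Calling `describe("chatbot")` returns a limited-risk description, while a use case missing from the table is flagged for assessment rather than being assigned a tier by guesswork.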

Final Draft of the Artificial Intelligence Act as of 21st January 2024

France, an example of copyright protection

Many countries around the world afford authors protection through copyright legislation. Among them, France’s copyright laws are renowned for being particularly robust.

Protections of the author’s rights

  1. French copyright law automatically protects original works upon creation, without registration, until 70 years after the author’s death. Afterward, the work enters the public domain, except for the author’s “moral rights”, which remain valid forever.
  2. Short quotations aren’t usually allowed for graphic or plastic art in France. Reproducing entire pieces violates copyright regardless of medium.
  3. Appropriation or fair use isn’t recognized by French courts. Creating derivative works requires permission from the original work’s creator.
  4. The so-called 70% similarity rule is an urban legend; no such threshold exists. Similarity evaluations depend on judges’ interpretation.
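The 70-year term in the first point comes down to simple arithmetic. The sketch below rests on two assumptions: that the term is counted in whole calendar years, so a work becomes free on 1 January after the 70th year following the author's death, and that no special extension applies. It says nothing about moral rights, which do not expire. The function name and example year are hypothetical.

```python
def in_public_domain(author_death_year, current_year, term_years=70):
    """Rough check of whether the economic rights have expired.

    Assumes protection runs to the end of the 70th calendar year after
    the author's death and ignores any special extensions.
    """
    return current_year > author_death_year + term_years

# Hypothetical author who died in 1950: protected through 2020, free from 2021.
still_protected = not in_public_domain(1950, 2020)
now_free = in_public_domain(1950, 2021)
```

A check like this is a first filter at best; for real projects the status of a specific work should be confirmed case by case.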

In general, copying another person’s work may lead to liability for unfair competition, even if it doesn’t infringe copyright directly.

However, inspiration is not an infringement

  1. Ideas and concepts aren’t covered by copyright laws. Only tangible, unique creations qualify for protection.
  2. Coincidental similarities between distinct works may occur, especially if both draw inspiration from shared cultural resources like folklore, symbols, and mythologies.
  3. Parody is an acceptable defense against copyright claims as long as there’s no intentional harm, confusion, or brand interference involved. However, exceptions could apply depending on freedom of speech considerations.

Mitigate the risks of using generative AI creations commercially until clearer regulation covers this area.

  1. Review the “terms & conditions” of AI services you are using. Opt for working with companies offering clear communication about their data sources.
  2. Avoid invoking names of authors or works still under copyright in your prompts. Ensure proper permissions from creators if necessary.
  3. Refrain from referencing living celebrities or depicting individuals without their consent via detailed descriptions in prompts.
  4. Verify whether highly similar outputs (e.g. generated images) exist through search engines.
  5. Add personal touches to generative AI outcomes.
  6. Save initial prompts and assets to demonstrate authenticity and showcase genuine creativity should questions arise regarding copying or infringing on IP holders’ rights.
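The last recommendation, keeping prompts and assets, can be as simple as an append-only log. Here is a minimal sketch assuming one JSON line per generated asset; the function, the field names, and the tool name are hypothetical, and the SHA-256 fingerprint is just one way to tie a log entry to a specific file.

```python
import hashlib
import json
from datetime import datetime, timezone

def provenance_record(prompt, tool, asset_bytes):
    """Build a log entry linking a prompt to the exact file it produced."""
    return {
        "created_at": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "prompt": prompt,
        # The hash changes if the file is altered, so it identifies this exact asset.
        "asset_sha256": hashlib.sha256(asset_bytes).hexdigest(),
    }

# Hypothetical prompt and placeholder bytes standing in for a real image file.
record = provenance_record(
    prompt="a watercolor lighthouse at dusk",
    tool="hypothetical-image-tool",
    asset_bytes=b"placeholder for the generated image file contents",
)
line = json.dumps(record)  # one line to append to a log file
```

Storing the original file alongside the log matters too: the hash only proves which file a record refers to if that file is still available.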
