Metadata-Version: 2.1
Name: code-rl
Version: 0.0.8
Summary: Code RL
Home-page: https://github.com/dheerajmpai/code-rl
Author: Dheeraj Pai
Author-email: dheerajmpaicmu@gmail.com
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
Requires-Dist: gym
Requires-Dist: intercode-bench ==0.1.22
Requires-Dist: langchain ==0.1.3
Requires-Dist: trl ==0.7.10
Requires-Dist: pwntools ==4.11.1
Requires-Dist: faiss-cpu ==1.7.4
Requires-Dist: pymetasploit3 ==1.0.5
Requires-Dist: sqlmap ==1.8
Requires-Dist: llama-index ==0.9.37.post1
Requires-Dist: transformers

# code-rl: Reinforcement Learning for Code Generation

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WQAKMmZYJjseBenRAlQseJu3auvBHwe4)
![GitHub stars](https://img.shields.io/github/stars/dheerajmpai/code-rl?style=social)
![GitHub forks](https://img.shields.io/github/forks/dheerajmpai/code-rl?style=social)
[![Downloads](https://static.pepy.tech/badge/code-rl)](https://pepy.tech/project/code-rl)

## Overview

`code-rl` is a Python package for training language models to generate code with reinforcement learning. It provides an OpenAI Gym environment that offers a structured way to assess the quality of the code these models generate. As of version 0.0.3, `code-rl` supports only the C programming language; support for Java and Golang is planned for future releases.

## Installation

To install `code-rl`, run the following command:

```bash
pip install code-rl
```

`code-rl` requires Python 3, so make sure it is installed before running the command above.

## Getting Started

### Prerequisites

Supported systems: Linux and macOS.

GCC 6+ or Clang 11+ must be installed.
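
As a quick sanity check before using the environment, the sketch below looks for a C compiler on `PATH`. The helper name `find_c_compiler` is hypothetical and not part of `code-rl`; it only confirms that a `gcc` or `clang` executable exists and does not verify the version requirement.

```python
import shutil


def find_c_compiler(candidates=("gcc", "clang")):
    """Return the path of the first candidate compiler found on PATH, or None.

    Hypothetical helper for illustration; it does not check compiler versions.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


compiler = find_c_compiler()
if compiler is None:
    print("No gcc or clang found on PATH; install one before using code-rl.")
else:
    print(f"Found C compiler at {compiler}")
```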


### Basic Usage

The following example runs a simple episode loop with a dummy model that always returns the same C program:

```python
from coderl import CodeCompilerEnv
# dummy model
def model(prompt):
    return "int main(){return 0;}"
env = CodeCompilerEnv()
# Example of running a training episode
observation = "Give me a code to print hello world"
for _ in range(1000):
    action = model(observation)
    observation, reward, done, info = env.step(action)
    if done:
        break
    # Update your prompt based on the stderr/stdout provided at info
print(reward)
print(info)
```
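
The loop above leaves the prompt update as a comment. One possible way to fold the feedback back into the prompt is sketched below. The helper `build_next_prompt` and its `max_chars` parameter are assumptions made for illustration, not part of `code-rl`; the only thing taken from this README is that `info` is a dictionary that may carry `stderr` or `stdout` keys.

```python
def build_next_prompt(task, info, max_chars=2000):
    """Combine the original task with compiler/runtime feedback.

    Hypothetical helper: `info` is assumed to be the dict returned by
    env.step(), optionally holding 'stderr' and/or 'stdout' strings.
    """
    stderr = (info.get("stderr") or "").strip()
    stdout = (info.get("stdout") or "").strip()
    parts = [task]
    if stderr:
        # Truncate long compiler output so the prompt stays bounded.
        parts.append("The previous attempt produced these messages:\n" + stderr[:max_chars])
    if stdout:
        parts.append("The previous attempt printed:\n" + stdout[:max_chars])
    return "\n\n".join(parts)


task = "Give me a code to print hello world"
info = {"stderr": "temp_code.c:4:9: error: unused variable"}
print(build_next_prompt(task, info))
```

In the training loop, the returned string could be passed to the model in place of the raw observation on the next iteration.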

## API Reference


- `CodeCompilerEnv`: The Gym environment for code evaluation.
  - `reset()`: Resets the environment to its initial state.
  - `step(action)`: Executes an action in the environment.
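
Judging by the examples in this README, `step(action)` returns a 4-tuple of observation, reward, done flag, and an info dictionary. The sketch below gives that tuple named fields for readability; `StepResult` and `describe` are hypothetical names introduced here and are not provided by `code-rl`.

```python
from typing import Any, Dict, NamedTuple


class StepResult(NamedTuple):
    """Named view over the (observation, reward, done, info) tuple (hypothetical)."""
    observation: Any
    reward: float
    done: bool
    info: Dict[str, Any]


def describe(result):
    """Format a step result as a one-line summary string."""
    step = StepResult(*result)
    # Prefer stderr when present, since it carries compiler diagnostics.
    stream = "stderr" if "stderr" in step.info else "stdout"
    return f"reward={step.reward} done={step.done} {stream}={step.info.get(stream, '')!r}"


# Tuple copied from the first example in the Examples section.
print(describe((1, 1, True, {"stdout": "Hello World"})))
# reward=1 done=True stdout='Hello World'
```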


## Examples

```python
from coderl import CodeCompilerEnv
env = CodeCompilerEnv()
code = """
#include<stdio.h>
int main(){
    printf("Hello World");
    return 0;
    }"""
result = env.step(code)
print(result) # (1, 1, True, {'stdout': 'Hello World'})
```

```python
code = """
#include<stdio.h>
struct Point {
    int x, y;
};
int main(){
    if (1){
        printf("Hello World");
    }
    struct Point p = { 1 };  // y is not initialized, which triggers an error under -Wextra
    printf("%d", p.x );
    return 0;
    }"""
result = env.step(code)
print(result) # (0, -1, True, {'stderr': 'temp_code.c: ...'})
```

```python
code = """
#include<stdio.h>
int main(){
    int x; // unused variable, flagged by -Wall
    printf("Hello World");
    return 0;
    }"""
result = env.step(code)
print(result) # (0, -2, True, {'stderr': 'temp_code.c:4:9: error: unused variable ‘x’  ...'})
```

```python
code = """
#include<stdio.h>
void foo(void){
    return;
    }

int main(){
    printf("Hello World");
    foo();
    }
"""
result = env.step(code)
print(result) # inspect the reward and info['stdout'] / info['stderr'] for this program
```


## Contributing


We warmly welcome contributions to `code-rl` and value your efforts to improve and expand this package. If you're interested in contributing, please start by forking the repository and submitting your changes through a pull request. We encourage you to adhere to established Python coding standards (PEP 8) for consistency. When submitting a pull request, please provide a clear description of the changes and any relevant issue numbers. We also recommend adding tests for new features to ensure reliability. For substantial changes, please open an issue first to discuss what you would like to change. Your contributions play a significant role in the development of `code-rl`, and we look forward to collaborating with the community!

## License

MIT License

